mirror of https://github.com/vale981/ray synced 2025-03-27 21:46:43 -04:00
Commit graph

13360 commits

Author SHA1 Message Date
Simon Mo
353d7e107f
[Serve] Improve Serialization () 2020-03-29 14:57:19 -07:00
mehrdadn
fc23f79f82
Windows process issues () 2020-03-29 12:48:32 -07:00
fangfengbin
6ce8b63bb6
fix TestTaskLeaseRenewal test failure () 2020-03-29 11:18:47 +08:00
Edward Oakes
d87563937e
Revert "[Dashboard] Metrics Export Service. ()" () 2020-03-28 19:27:34 -07:00
Eric Liang
d6255c3395
Fix build breakage due to soft torch import () 2020-03-28 19:08:31 -07:00
Sven Mika
e4bd5db4d8
[RLlib] Minimal ParamNoise PR. () 2020-03-28 16:16:30 -07:00
Eric Liang
5cebee68d6
[rllib] Add scaling guide to documentation, improve bandit docs ()
* update

* reword

* update

* ms

* multi node sgd

* reorder

* improve bandit docs

* contrib

* update

* ref

* improve refs

* fix build

* add pillow dep

* add pil

* update pil

* pillow

* remove false
2020-03-27 22:05:43 -07:00
Maksim Smolin
7b27ce2b23
[RaySGD] Convert the head worker to a local model ()
Why are these changes needed?

Running a worker on head (locally, not as a Ray actor) allows for easier handling of stateful stuff like logging and for easier debugging.
2020-03-27 20:19:15 -07:00
Richard Liaw
875309fc48
Revert wide docs () 2020-03-27 17:46:08 -07:00
Richard Liaw
e10dc91821
Fix doc build () 2020-03-27 17:39:38 -07:00
Mitchell Stern
090a8474b0
[Dashboard] Update dependencies and add linting rules () 2020-03-27 16:53:49 -07:00
Carl Balmer
0cfb6488a7
changed get_agent_class to from get_trainable_cls () 2020-03-27 12:17:16 -07:00
Simon Mo
838c1e854f
Add results from 0.8.3 release () 2020-03-27 11:14:15 -07:00
SangBin Cho
86e19959a5
[Dashboard] Tune dashboard bug fix ()
* Figured out why Tune was unavailable.

* Minor fix.
2020-03-27 09:02:30 -07:00
Kai Yang
6a3503c494
Fix reusing the cached hash of nil ID () 2020-03-27 23:40:03 +08:00
SongGuyang
c195dc8f88
Basic C++ worker implementation () 2020-03-27 23:01:08 +08:00
Sven Mika
93b5c38b7d
[RLlib] Noisy layers in DQN throw different errors (issue ). ()
* Rollback.

* Fix issue 7635.

* Fix issue 7635.

* LINT and bug fix.
2020-03-26 22:08:34 -07:00
Sven Mika
369a3417c4
[RLlib] Add tf-graph by default when doing Policy.export_model(). ()
* Rollback.

* WIP.

* WIP.

* Fix.

* LINT.
2020-03-26 22:07:10 -07:00
SangBin Cho
7a0befb0a7
[Dashboard] Metrics Export Service. () 2020-03-26 14:03:00 -07:00
hhoke
af3a5705ca
--redis-address -> --address ()
The exception tells the user to use --redis-address, but that flag is deprecated. This change tells the user to use the current --address instead.
2020-03-26 13:52:39 -07:00
Saurabh Gupta
6ddf84b019
Contextual Bandit algorithms (WIP) () 2020-03-26 13:41:16 -07:00
Cloud Han
c1b05b720d
calling register_custom_serializer requires ray to be initialized () 2020-03-26 10:24:06 -07:00
Sven Mika
bcf963a53b
[RLlib] Bug default policy overrides torch policy. ()
* Rollback.

* Bug fix!
2020-03-26 10:03:20 -07:00
fangfengbin
e196fcdbaf
Add gcs_service_enabled function to avoid getting environment variable directly () 2020-03-26 22:02:53 +08:00
Richard Liaw
ca6eabc9cb
[tune] Fail Fast ()
* pytest

* init cancel

* testing

* Update python/ray/tune/tests/test_tune_server.py

Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>

* change-test

* Apply suggestions from code review

* Apply suggestions from code review

* finished

* set_finished

* tune

* fix

Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-26 00:04:09 -07:00
hubcity
3d0a8662b3
- Fixing broken links ()
* - Fixing broken links

* Apply suggestions from code review

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-25 21:46:13 -07:00
Eric Liang
23b6fdcda1
ray memory should collect statistics from all nodes () 2020-03-25 16:31:31 -07:00
Stephanie Wang
46404d8a0b
[core] Pin lineage of plasma objects that are still in scope ()
* Fix deadlock in DrainAndShutdown

* Revert "[core] Revert lineage pinning () ()"

This reverts commit ba86a02b37.

* debug rllib

* debug rllib

* turn on all rllib tests again

* debug rllib

* Fix drain bug, check number of pending tasks

* revert rllib debug

* remove todo

* Trigger rllib tests

* revert rllib debug commit
2020-03-25 09:29:32 -07:00
Richard Liaw
82b792be33
[tune] IP Check, Flatten Results for TBX ()
* support_flattened

* loggers

* Format logger changes

Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-25 09:18:03 +00:00
Maksim Smolin
e95455b7d7
[RaySGD] Add tqdm logging to TorchTrainer ()
* Update issue templates

* Init fp16

* fp16 and schedulers

* scheduler linking and fp16

* to fp16

* loss scaling and documentation

* more documentation

* add tests, refactor config

* moredocs

* more docs

* fix logo, add test mode, add fp16 flag

* fix tests

* fix scheduler

* fix apex

* improve safety

* fix tests

* fix tests

* remove pin memory default

* rm

* fix

* Update doc/examples/doc_code/raysgd_torch_signatures.py

* fix

* migrate changes from other PR

* ok thanks

* pass

* signatures

* lint

* Update python/ray/experimental/sgd/pytorch/utils.py

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* should address most comments

* comments

* fix this ci

* first_pass

* add overrides

* override

* fixing up operators

* format

* sgd

* constants

* rm

* revert

* Checkpoint the basics

* End of day checkpoint

* Checkpoint log-to-head implementation

* Checkpoint

* Add actor-based batch log reporting, currently segfaults

* Work around progress segfault

* Fix some stuff in quicktorch

* Make things more customizable

* Quality of life fixes

* More quality of life

* Move tqdm logic to training_operator

* Update examples

* Fix some minor bugs

* Fix merge

* Fix small things, add pbar to dcgan

* Run format.sh

* Fix missing epoch number for batch pbar

* Address PR comments

* Fix float is not subscriptable

* Add train_loss to pbar by default

* Isolate tqdm code into a handler system

* Format

* Remove the batch_logs_reporter from distributed runner as well

* Check if the train_loss is available before using it

* Enable tqdm in the dcgan example

* Fix a crash in no-handler trainers

* Fix

* Allow not calling set_reporters for tests

Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-03-24 23:43:56 -07:00
Richard Liaw
54a892bb84
[tune] Cancel Experiment via Client ()
* init cancel

* testing

* Update python/ray/tune/tests/test_tune_server.py

Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>

* Apply suggestions from code review

* Apply suggestions from code review

* finished

* set_finished

Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-24 20:30:12 -07:00
Simon Mo
a519b4f2a9
[Serve] Enhancement in HTTP Methods and Multi-route support () 2020-03-24 20:25:05 -07:00
Stephanie Wang
a1cee6af7b
Revert "New scheduler local node ()" ()
This reverts commit 6141fdab95.
2020-03-24 18:32:16 -07:00
Xianyang Liu
cc0490b55b
Several small fixes for function_manager () 2020-03-24 14:28:15 -07:00
Ion
6141fdab95
New scheduler local node () 2020-03-24 13:59:50 -05:00
fangfengbin
bf866de6fd
Enable GCS Service by default () 2020-03-24 14:20:23 +08:00
mehrdadn
b4030cdbbe
File HANDLE/descriptor translation layer for Windows ()
* Use TCP sockets on Windows with custom HANDLE <-> FD translation layer

* Get Plasma working on Windows

Co-authored-by: Mehrdad <noreply@github.com>
2020-03-23 21:08:25 -07:00
Robert Nishihara
2b80310e6f
Remove setup.py dependence on packaging. () 2020-03-23 16:21:17 -07:00
Edward Oakes
9318b29f5e
Remove is_direct logic from the raylet () 2020-03-23 17:09:35 -05:00
Richard Liaw
3fa2e4a346
[docs] Fix import breaking docs build ()
* psutil missing

* ok
2020-03-23 13:21:39 -07:00
Stephanie Wang
7f38cc1d03
Debug statements and increase timeout for test array () 2020-03-23 13:02:14 -07:00
Eric Liang
9a590ac6a5
[rllib] Fix custom model metrics in multi-device case ()
* fix example

* add example test

* lin
2020-03-23 12:40:22 -07:00
aannadi
8adc84ccb9
[Dashboard] Add sorted columns and TensorBoard to Tune tab () 2020-03-23 12:30:51 -07:00
Richard Liaw
e311013afd
[tune] Reformat Sections of API Reference ()
* moveit

* moveit

* docstrings to ref

* Update tune-usage.rst

Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-23 12:23:21 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. () 2020-03-23 12:19:30 -07:00
Robert Nishihara
ee8c9ff732
Remove six and cloudpickle from setup.py. () 2020-03-23 11:42:05 -07:00
Robert Nishihara
1a0c9228d0
Remove pytest from setup.py and other minor changes. () 2020-03-23 08:46:56 -07:00
ZhuSenlin
74825db804
Fix TestGcsRedisFailureDetector ()
* fix test_gcs_redis_failure_detector

* fix test_gcs_redis_failure_detector

Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-03-23 22:48:53 +08:00
Simon Mo
afad0ed085
[Serve] Add async, multi methods support for serve actors () 2020-03-23 00:45:26 -07:00
ZhuSenlin
039961b63a
rename ActorTable to LogBasedActorTable and add new ActorTable () 2020-03-23 15:05:43 +08:00