3889 commits

Author SHA1 Message Date
AnanthHari
aa2a0cb6da Fixes empty state argument in compute_single_action method (#6894)
* Fixes empty `state` parameter in compute_single_action method

* Fixed style
2020-01-23 00:42:52 -08:00
Ujval Misra
1558307ac4 [tune] Prevent MEMORY checkpoints from breaking trial FT (#6691)
* Prevent MEMORY checkpoints from breaking FT

* Add save/pause/resume/restore test

* change checkpoint return value based on status

* Fix test_checkpoint_manager_tests.

* Fix test + checkpoint manager bug

* lint

* Add docstring

* Add docstring to checkpoint_manager constructor

* Change variable name for clarity

* Revert on_checkpoint docstring wording

* Break after success

* nit: more informative warning

* Quarantine test
2020-01-22 23:17:09 -08:00
Yunzhi Zhang
0834bda8c1 [Dashboard] Display actor task execution info (#6705)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
2020-01-22 22:33:55 -08:00
Sven Mika
ae9a3a2237 [RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865) 2020-01-22 17:02:58 -08:00
Simon Mo
5f527816fe Fix async actor high cpu utilization when idle (#6877) 2020-01-22 16:07:08 -08:00
Simon Mo
4dd41844d0 Ignore blocking ray.wait if timeout is zero (#6891) 2020-01-22 16:05:34 -08:00
Eric Liang
6bb30c9f1b fix links (#6883) 2020-01-22 01:06:07 -08:00
Richard Liaw
2b0e93586f [autoscaler] Auto-replace "DEFAULT" with most recent DLAMI (#6848)
* try_this

* fix

* actual fix

* default
2020-01-21 13:54:04 -08:00
Richard Liaw
4edfaf2f38 [tune] Support callable objects in variant generation (#6849)
* minorcallable

* format
2020-01-21 10:24:25 -08:00
Frank Röder
dac6268c5b [tune] Fix broken link in Tune User Guide (#6866) 2020-01-21 10:21:14 -08:00
chaokunyang
289e5e8aff enable maven checkstyle (#6829) 2020-01-20 23:41:54 -08:00
Sven Mika
c957ed58ed [RLlib] Implement PPO torch version. (#6826) 2020-01-20 23:06:50 -08:00
Ce Gao
574abe844a [ray-operator] Remove useless RBAC rules (#6853)
Signed-off-by: Ce Gao <gaoce@caicloud.io>
2020-01-21 00:31:07 -06:00
Lingxuan Zuo
7e484687d3 Use GET-SET macro to reduce duplicated code. (#6863) 2020-01-21 10:57:57 +08:00
mehrdadn
139bf8908e Replace UNIX sockets with TCP sockets in Ray on Windows (#6823)
* Replace UNIX sockets with TCP sockets in Ray
2020-01-20 17:28:11 -08:00
Stephanie Wang
815cd0e39a Task and actor fate sharing with the owner process (#6818)
* Add test

* Kill workers leased by failed workers

* merge

* shorten test

* Add node failure test case

* Fix FromBinary for nil IDs, add assertions

* Test

* Fate sharing on node removal, fix owner address bug

* lint

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: Zhijun Fu <37800433+zhijunfu@users.noreply.github.com>

* fix

* Remove unneeded test

* fix IDs

Co-authored-by: Zhijun Fu <37800433+zhijunfu@users.noreply.github.com>
2020-01-20 16:44:04 -08:00
Eric Liang
14016535a5 [rllib] Add TF and Torch icons to show which are available for each algo (#6869) 2020-01-20 15:22:21 -08:00
Ce Gao
125e26dde5 [ray-operator] Watch the pod resource and remove useless code (#6852)
Signed-off-by: Ce Gao <gaoce@caicloud.io>
2020-01-20 12:13:30 -06:00
Ce Gao
23f32c5ec8 [ray-operator]: Add ignore file (#6851)
Signed-off-by: Ce Gao <gaoce@caicloud.io>
2020-01-20 12:13:01 -06:00
Philipp Moritz
96e2c1ae74 [Projects] Add small tutorial for projects (#6641) 2020-01-20 09:33:41 -08:00
mehrdadn
10609c3a19 Use standard EditorConfig file for editor settings (#6861)
https://editorconfig.org/

Co-authored-by: GitHub Web Flow <noreply@github.com>
2020-01-20 08:03:06 -08:00
Robert Nishihara
c2cbb85a43 Fix flaky test test_feature_flag (#6850) 2020-01-19 20:59:03 -08:00
Richard Liaw
341ddd0a09 [tune] Default to TensorboardX and include in requirements. (#6836) 2020-01-19 01:49:33 -08:00
Eric Liang
a229bdf272 [rllib] Deprecate custom preprocessors (#6833)
* deprecation warnings

* add log warn

* fix test
2020-01-18 23:30:09 -08:00
Richard Liaw
8a9bd18606 [tune] Remove keras dependency (#6827) 2020-01-18 23:24:42 -08:00
Richard Liaw
c9a1810392 [doc] Add meetup link (temporary) (#6835) 2020-01-18 17:53:47 -08:00
Sven Mika
7659cae3ba [RLlib] Add PG torch regression test (#6828)
* Add PG torch regression test to tuned_examples/regression_tests dir.

* Rename cartpole-pg.yaml into cartpole-pg-tf.yaml

* cartpole-pg-tf.yaml: Change cartpole-pg name of tuned_example to cartpole-pg-tf.
2020-01-18 15:57:12 -08:00
Justin Terry
97bf79917c [RLlib] Update MADDPG example repo to maintained fork (#6831) 2020-01-18 13:08:27 -08:00
Yuhao Yang
9b1d2953de [tune] set correct path when deleting checkpoint folder (#6758) 2020-01-17 23:11:03 -08:00
Sven Mika
303547f119 [RLlib] Policy-classes cleanup and torch/tf unification. (#6770) 2020-01-17 22:26:28 -08:00
Mitchell Stern
763818b476 [Dashboard] Add static assets for speedscope v1.5.3 (#6822) 2020-01-17 20:53:53 -08:00
Sven Mika
e6227082bd [RLlib] Add torch flag to train.py (#6807) 2020-01-17 18:48:44 -08:00
Yunzhi Zhang
3acf3c7675 [Dashboard] Add actor task counter (#6820) 2020-01-17 15:43:56 -08:00
Simon Mo
8f246c17b5 Initialize async plasma for async actors (#6813)
* Initialize async plasma for async actors

* Address comment
2020-01-17 14:58:06 -08:00
Ameer Haj Ali
9f9c3f5026 adding context parameter for pool with a warning for not being supported (#6776) 2020-01-17 16:57:18 -06:00
Richard Liaw
a3a268435f [docs] Edit survey links (#6777) 2020-01-17 11:52:04 -08:00
Zhijun Fu
92380dd4e6 Fix crash in HandleObjectMissing when direct actor creation task is not found in local_queues_ (#6817) 2020-01-17 13:29:13 -06:00
Edward Oakes
30776450a3 num_cpus=1 by default in Pool (#6812) 2020-01-17 13:28:25 -06:00
Qstar
0f3205af0b [Projects] Delete pods associated with the project when running ray session stop (#6787) 2020-01-17 10:42:30 -08:00
Mitchell Stern
9f96091aef [Dashboard] Add logical view displaying actor tree (#6810)
* [Dashboard] Add logical view displaying actor tree

* Fix key error in test_raylet_info_endpoint
2020-01-17 10:25:27 -08:00
Yuhao Yang
5f36e6eacb [tune] get checkpoints paths for a trial after tuning (#6643) 2020-01-17 10:15:04 -08:00
chaokunyang
fa3c513276 [Streaming] Streaming filter transform (#6816)
* add filter transform

* lint

* add new line
2020-01-17 22:05:47 +08:00
micafan
e143f85ca0 [GCS] Use new interface class GcsClient in ray (#6805) 2020-01-17 14:51:18 +08:00
Mitchell Stern
8e8b66a4b8 Add route for /favicon.ico to fix missing favicon (#6815) 2020-01-16 21:03:21 -06:00
Richard Liaw
232be5a058 [sgd] fault tolerance for pytorch + revamp documentation (#6465) 2020-01-16 18:38:27 -08:00
fangfengbin
e5ad4e6f8d Add worker info handler to gcs service (#6798)
* add worker info handler

* rebase master

* add log

* remove unused variable

* fix code style
2020-01-16 22:35:00 +08:00
Mitchell Stern
05674c219f Accept any port in test_get_webui in test_webui.py (#6804) 2020-01-15 23:16:35 -06:00
mehrdadn
fb8e3615d5 Use Boost.Process instead of pid_t (#6510)
* Use Boost.Process instead of pid_t

This will let us handle child processes (mostly) uniformly across platforms.
TODO: There is no SIGTERM on Windows; achieving something equivalent is fairly involved.
2020-01-15 20:05:02 -08:00
fangfengbin
f9fa93eaf1 Add error info handler to gcs service (#6754)
* add error info accessor

* rebase master

* add function comments

* capture type instead of request
2020-01-16 11:59:00 +08:00
Edward Oakes
b750bd7fc9 Use 2xlarge instances in long running tests (#6802) 2020-01-15 19:47:59 -06:00