SangBin Cho
224933b5e4
[Placement Group] Remove API part 2 ( #10215 )
...
* Initial progress done.
* Fix mistake.
* Addressed code review.
* Fix cpp build issue.
* Addressed code review.
2020-08-20 09:50:13 -07:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. ( #10115 )
2020-08-20 17:05:57 +02:00
Eric Liang
538cb802d5
[autoscaler] Refactor multi node type autoscaler config ( #10190 )
2020-08-19 20:46:00 -07:00
Richard Liaw
2fd59de05d
[autoscaler] hotfix - swallowed error for missing yaml ( #10212 )
2020-08-19 20:02:56 -07:00
Amog Kamsetty
9ff687c093
[SGD][Docs] docs for training/ validation results ( #10181 )
2020-08-19 17:22:28 -07:00
Simon Mo
a785106b47
[Doc] Remove experimental marker for asyncio API ( #10202 )
2020-08-19 16:52:50 -07:00
Amog Kamsetty
44e254788a
[Tune] PBT hyperparam_mutations improvements ( #10170 )
2020-08-19 16:50:19 -07:00
Eric Liang
5d265e9bd1
remove osx and linux actions ( #10209 )
2020-08-19 15:43:03 -07:00
architkulkarni
a3a9421787
added single quotes in pip install 'ray[rllib]'
2020-08-19 15:34:49 -07:00
Raphael Avalos
8b704eb419
Small fix for Cuda Torch DQN. ( #10177 )
2020-08-19 13:28:05 -07:00
Alex Wu
b70dce0d02
[autoscaler] Hotfix bad None check ( #10196 )
2020-08-19 13:27:20 -07:00
fangfengbin
9734dbca3e
[Placement Group]Reschedule bundles when the node of bundles is dead ( #10021 )
2020-08-19 13:24:42 -07:00
Edward Oakes
888f0a2c60
[serve] Use ray.experimental.metrics ( #10185 )
2020-08-19 13:03:22 -05:00
architkulkarni
de46464aa3
[Experimental] Queue: replace polling with async actor ( #10120 )
2020-08-19 11:55:42 -05:00
Sven Mika
2cbe29a7fa
[RLlib] Curiosity minor fixes, do-overs, and testing. ( #10143 )
2020-08-19 17:49:50 +02:00
Max Fitton
9c5e5a9757
[Dashboard] Fix and Recommit Reverted Group by Actor Class PR ( #10186 )
...
* Revert "Revert "[Dashboard] Group by Actor Class (#10147 )" (#10180 )"
This reverts commit e4d2ca620a.
* Fix metrics test to agree with the new logical view API
* lint2
Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-18 20:55:58 -07:00
Edward Oakes
ba0f531da0
[serve] Remove SLO code and blist dependency ( #10075 )
2020-08-18 17:52:36 -05:00
SangBin Cho
263df6163c
[Placement Group] Placement group remove api part 1 ( #10063 )
...
* Added basic rpc calls.
* fix issues.
* Fix the gcs server not getting request issue.
* In Progress.
* Basic logic done. Tests are required.
* In progress.
* In progress in refactoring context.
* Revert "In progress in refactoring context."
This reverts commit 38236256cf1306c60dd203e75d45ceb4509c8106.
* Working now.
* Python test works.
* Lint.
* Addressed code review.
* Addressed code review.
* Lint.
* Added unit tests.
* Done, but one of the unit tests fails.
* Addressed code review.
* Addressed the last code review.
* Fix the wrong test case.
2020-08-18 12:44:00 -07:00
Lixin Wei
d188becec2
[Python Worker] Add pid to log file name ( #10149 )
...
Co-authored-by: Alex Wu <alex@anyscale.io>
2020-08-18 11:48:48 -07:00
Simon Mo
bedc2c24c8
Export Metrics in OpenCensus Protobuf Format ( #10080 )
2020-08-18 11:32:42 -07:00
Max Fitton
8d06e30a06
[Dashboard] Fix Ray Dashboard command error messages ( #10050 )
2020-08-18 13:30:51 -05:00
Max Fitton
e4d2ca620a
Revert "[Dashboard] Group by Actor Class ( #10147 )" ( #10180 )
...
This reverts commit 71f6f83f1d.
2020-08-18 11:27:46 -07:00
Tomasz Wrona
aff7f19360
[tune] Added logger_config field ( #8521 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 11:10:22 -07:00
Richard Liaw
eacf7dddba
update-code ( #10106 )
2020-08-18 09:28:32 -07:00
Arya Irani
f733d2648b
[docs] fix typo in deployment.rst ( #10074 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 00:05:18 -07:00
Alex Wu
0b5d5ec17d
[Autoscaler] Pass custom resources to "ray start" multi instance autoscaling ( #9986 )
2020-08-17 22:34:07 -07:00
Max Fitton
71f6f83f1d
[Dashboard] Group by Actor Class ( #10147 )
...
* Update dashboard API to be able to pass actors in a flat structure in addition to nested.
* Working on adapting front-end to display UI w/ new actor class grouping
* wip
* Group logical view by actor class.
Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-17 22:03:50 -07:00
Barak Michener
13f62b74f3
cpp: Fix include order in the cpp api.h for generated files ( #10161 )
2020-08-17 22:02:57 -07:00
SangBin Cho
8cedcdf2df
[Tests] Fix test output ( #10162 )
...
* Trial 1.
* Fix.
* Revert "Fix."
This reverts commit 26ad970f753d581f340857be30054d6954df8255.
* Revert "Trial 1."
This reverts commit 63f7aca5162bb40f2d5e28fb9647598cbde7ad41.
* Another fix try.
* Last trial.
* Remove unnecessary comment.
* Small fix.
* Use better units.
* Lint.
2020-08-17 21:24:20 -07:00
Robert Nishihara
d45418936c
Skip failing tests on Windows. ( #10139 )
2020-08-17 18:56:17 -07:00
Richard Liaw
927a073226
[tune] Update node syncing documentation ( #10126 )
2020-08-17 18:08:27 -07:00
Amog Kamsetty
d3bac298d5
[Tune] PBT Error if metric not available ( #9957 )
2020-08-17 16:12:14 -07:00
Alex Wu
4b14bf85e4
[Autoscaler] Resource demand vector (heartbeat -> autoscaler plumbing) ( #10127 )
2020-08-17 13:57:15 -07:00
Sven Mika
fe0bdb23ff
[RLlib] Attention Net/Transformers docs improvement.
2020-08-17 13:07:17 -07:00
Eric Liang
ca133e2699
[rllib] Remove extra model config kwargs passed incorrectly for Torch models ( #10055 )
2020-08-17 11:12:20 -07:00
Noah
bd0b1488ef
[docs] Fix launching clusters link ( #10157 )
...
Not sure if this is the correct place to point, but better than a 404
2020-08-17 11:03:32 -07:00
Ian Rodney
a079f46c25
[autoscaler]/[docker] Cleanup YAMLs & Use RAY docker images ( #10108 )
2020-08-17 09:49:28 -07:00
SangBin Cho
053188dfbe
[Placement Group] Support Placement Group state table. ( #10090 )
...
* Done.
* Addressed code review.
* Linting.
* Fix lint.
* Fix lint.
* Fix a test.
* Lint.
* Add a lint sleep to test.
* Fix the lint issue.
* Fixed doc build error.
2020-08-17 09:24:50 -07:00
fangfengbin
edd783bc32
[Placement Group]Add soft pack strategy ( #10099 )
2020-08-17 12:01:34 +08:00
krfricke
8f0f7371a0
[tune] Added Kubernetes syncer and sync client ( #10097 )
...
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-16 14:09:28 -07:00
Julius Frost
dc659ae89a
make action probabilities a numpy array ( #10122 )
2020-08-16 11:25:12 -07:00
Philipp Moritz
c7adb464e4
[autoscaler] Fix run_env='host' for initialization commands ( #10137 )
2020-08-15 15:25:54 -07:00
Olli Huotari
9ff599cbb8
torch policy now includes model.metrics ( #10121 )
...
* torch policy now includes model.metrics
* Fixed tests to work with custom metrics
* Forgot to run format.sh
2020-08-15 10:43:11 -07:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. ( #10112 )
2020-08-15 15:09:00 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. ( #10114 )
2020-08-15 13:24:22 +02:00
Chua Cheow Huan
ea51e94729
[rllib] Learning rate schedule for DDPPO. ( #10006 )
...
* Get shared metrics, increment counter & set global vars for remote workers.
* Add unit test to test lr_schedule for DDPPO.
* Broadcast the local set of global vars to remote workers instead of independently setting the global vars on each rollout worker.
2020-08-15 00:51:45 -07:00
Olli Huotari
ed6d1d7a7c
[Misc] Include info about flake8_quotes in format.sh ( #10123 )
2020-08-15 00:09:02 -07:00
Philipp Moritz
e95f0afe4c
[autoscaler] Expand key path for hashing with expanduser ( #10125 )
2020-08-14 18:50:27 -07:00
Amog Kamsetty
f87a4aa45d
[Tune] Pbt Function API ( #9958 )
...
* adding function convnet example
* add unit test
* update test
* update example
* wip
* move error from experiment to tune
* wip
* Fix checkpoint deletion
* updating code
* adding smoke test
* updating pbt guide
* formatting
* fix build
* add best checkpoint analysis util
* update test
* add comments
* remove class api
* fix example
* add setup and teardown to tests
* formatting
* Update python/ray/tune/tests/test_trial_scheduler_pbt.py
Co-authored-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-14 17:52:30 -07:00
Tao Wang
fba5906ce3
[GCS] Re-report heartbeat when gcs server restarts ( #10040 )
...
* Retry sending failed heartbeat when light heartbeat is enabled
* Re-report heartbeat when gcs server restarts
* remove is_pubsub_server_restarted
* add lock per comment
* minor change, name related
2020-08-14 17:37:20 -07:00