Commit graph

5493 commits

Author SHA1 Message Date
architkulkarni
de46464aa3
[Experimental] Queue: replace polling with async actor (#10120) 2020-08-19 11:55:42 -05:00
Sven Mika
2cbe29a7fa
[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143) 2020-08-19 17:49:50 +02:00
Max Fitton
9c5e5a9757
[Dashboard] Fix and Recommit Reverted Group by Actor Class PR (#10186)
* Revert "Revert "[Dashboard] Group by Actor Class (#10147)" (#10180)"

This reverts commit e4d2ca620a.

* Fix metrics test to agree with the new logical view API

* lint2

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-18 20:55:58 -07:00
Edward Oakes
ba0f531da0
[serve] Remove SLO code and blist dependency (#10075) 2020-08-18 17:52:36 -05:00
SangBin Cho
263df6163c
[Placement Group] Placement group remove api part 1 (#10063)
* Added basic rpc calls.

* fix issues.

* Fix the gcs server not getting request issue.

* In Progress.

* Basic logic done. Tests are required.

* In progress.

* In progress in refactoring context.

* Revert "In progress in refactoring context."

This reverts commit 38236256cf1306c60dd203e75d45ceb4509c8106.

* Working now.

* Python test works.

* Lint.

* Addressed code review.

* Addressed code review.

* Lint.

* Added unit tests.

* Done, but one of the unit tests fails

* Addressed code review.

* Addressed the last code review.

* Fix the wrong test case.
2020-08-18 12:44:00 -07:00
Lixin Wei
d188becec2
[Python Worker] Add pid to log file name (#10149)
Co-authored-by: Alex Wu <alex@anyscale.io>
2020-08-18 11:48:48 -07:00
Simon Mo
bedc2c24c8
Export Metrics in OpenCensus Protobuf Format (#10080) 2020-08-18 11:32:42 -07:00
Max Fitton
8d06e30a06
[Dashboard] Fix Ray Dashboard command error messages (#10050) 2020-08-18 13:30:51 -05:00
Max Fitton
e4d2ca620a
Revert "[Dashboard] Group by Actor Class (#10147)" (#10180)
This reverts commit 71f6f83f1d.
2020-08-18 11:27:46 -07:00
Tomasz Wrona
aff7f19360
[tune] Added logger_config field (#8521)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 11:10:22 -07:00
Richard Liaw
eacf7dddba
update-code (#10106) 2020-08-18 09:28:32 -07:00
Arya Irani
f733d2648b
[docs] fix typo in deployment.rst (#10074)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-18 00:05:18 -07:00
Alex Wu
0b5d5ec17d
[Autoscaler] Pass custom resources to "ray start" multi instance autoscaling (#9986) 2020-08-17 22:34:07 -07:00
Max Fitton
71f6f83f1d
[Dashboard] Group by Actor Class (#10147)
* Update dashboard API to be able to pass actors in a flat structure in addition to nested.

* Working on adapting front-end to display UI w/ new actor class grouping

* wip

* Group logical view by actor class.

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-17 22:03:50 -07:00
Barak Michener
13f62b74f3
cpp: Fix include order in the cpp api.h for generated files (#10161) 2020-08-17 22:02:57 -07:00
SangBin Cho
8cedcdf2df
[Tests] Fix test output (#10162)
* Trial 1.

* Fix.

* Revert "Fix."

This reverts commit 26ad970f753d581f340857be30054d6954df8255.

* Revert "Trial 1."

This reverts commit 63f7aca5162bb40f2d5e28fb9647598cbde7ad41.

* Another fix try.

* Last trial.

* Remove unnecessary comment.

* Small fix.

* Use better units.

* Lint.
2020-08-17 21:24:20 -07:00
Robert Nishihara
d45418936c
Skip failing tests on Windows. (#10139) 2020-08-17 18:56:17 -07:00
Richard Liaw
927a073226
[tune] Update node syncing documentation (#10126) 2020-08-17 18:08:27 -07:00
Amog Kamsetty
d3bac298d5
[Tune] PBT Error if metric not available (#9957) 2020-08-17 16:12:14 -07:00
Alex Wu
4b14bf85e4
[Autoscaler] Resource demand vector (heartbeat -> autoscaler plumbing) (#10127) 2020-08-17 13:57:15 -07:00
Sven Mika
fe0bdb23ff
[RLlib] Attention Net/Transformers docs improvement. 2020-08-17 13:07:17 -07:00
Eric Liang
ca133e2699
[rllib] Remove extra model config kwargs passed incorrectly for Torch models (#10055) 2020-08-17 11:12:20 -07:00
Noah
bd0b1488ef
[docs] Fix launching clusters link (#10157)
Not sure if this is the correct place to point, but better than a 404
2020-08-17 11:03:32 -07:00
Ian Rodney
a079f46c25
[autoscaler]/[docker] Cleanup YAMLs & Use RAY docker images (#10108) 2020-08-17 09:49:28 -07:00
SangBin Cho
053188dfbe
[Placement Group] Support Placement Group state table. (#10090)
* Done.

* Addressed code review.

* Linting.

* Fix lint.

* Fix lint.

* Fix a test.

* Lint.

* Add a lint sleep to test.

* Fix the lint issue.

* Fixed doc build error.
2020-08-17 09:24:50 -07:00
fangfengbin
edd783bc32
[Placement Group]Add soft pack strategy (#10099) 2020-08-17 12:01:34 +08:00
krfricke
8f0f7371a0
[tune] Added Kubernetes syncer and sync client (#10097)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-16 14:09:28 -07:00
Julius Frost
dc659ae89a
make action probabilities a numpy array (#10122) 2020-08-16 11:25:12 -07:00
Philipp Moritz
c7adb464e4
[autoscaler] Fix run_env='host' for initialization commands (#10137) 2020-08-15 15:25:54 -07:00
Olli Huotari
9ff599cbb8
torch policy now includes model.metrics (#10121)
* torch policy now includes model.metrics

* Fixed tests to work with custom metrics

* Forgot to run format.sh
2020-08-15 10:43:11 -07:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. (#10112) 2020-08-15 15:09:00 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) 2020-08-15 13:24:22 +02:00
Chua Cheow Huan
ea51e94729
[rllib] Learning rate schedule for DDPPO. (#10006)
* Get shared metrics, increment counter & set global vars for remote workers.

* Add unit test to test lr_schedule for DDPPO.

* Broadcast the local set of global vars to remote workers instead of independently setting the global vars on each rollout worker.
2020-08-15 00:51:45 -07:00
Olli Huotari
ed6d1d7a7c
[Misc] Include info about flake8_quotes in format.sh (#10123) 2020-08-15 00:09:02 -07:00
Philipp Moritz
e95f0afe4c
[autoscaler] Expand key path for hashing with expanduser (#10125) 2020-08-14 18:50:27 -07:00
Amog Kamsetty
f87a4aa45d
[Tune] Pbt Function API (#9958)
* adding function convnet example

* add unit test

* update test

* update example

* wip

* move error from experiment to tune

* wip

* Fix checkpoint deletion

* updating code

* adding smoke test

* updating pbt guide

* formatting

* fix build

* add best checkpoint analysis util

* update test

* add comments

* remove class api

* fix example

* add setup and teardown to tests

* formatting

* Update python/ray/tune/tests/test_trial_scheduler_pbt.py

Co-authored-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-14 17:52:30 -07:00
Tao Wang
fba5906ce3
[GCS] Re-report heartbeat when gcs server restarts (#10040)
* Retry to send failed heartbeat when light heartbeat enabled

* Re-report heartbeat when gcs server restarts

* remove is_pubsub_server_restarted

* add lock per comment

* minor change, name related
2020-08-14 17:37:20 -07:00
Siyuan (Ryans) Zhuang
17ca1d8ff4
[Core] Object spilling prototype (#9818) 2020-08-14 15:39:10 -07:00
Robert Nishihara
36e626e95d
Revert "[Dashboard] Start the new dashboard (#9860)" (#10116)
This reverts commit 739933e5b8.
2020-08-14 14:06:57 -07:00
chaokunyang
7ffb37f711
[Java] add maven repo (#10109) 2020-08-14 11:31:01 -07:00
Simon Mo
d0c2e90577
[Build] Make sure local format.sh check protobuf (#10118)
* Ignore protobuf files for clangformat check

* Revert "Ignore protobuf files for clangformat check"

This reverts commit ccd84d4e1517220eb4e946918174150ce2265467.

* Make sure protobuf is checked locally
2020-08-14 11:22:55 -07:00
Philipp Moritz
6b53df9599
Hash contents of SSH key instead of key path (#10103) 2020-08-14 00:10:31 -07:00
fangfengbin
3a6fa7d622
[Placement Group]Optimize placement group strict pack strategy (#9924)
* add part code

* add code

* add part code

* rm used import

* add part code

* add part code

* add part code

* add part code

* add part code

* add part code

* fix review comment

* add testcase

* use ResourceSet

* fix review comment

* fix ut bug

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-13 23:58:52 -07:00
Lixin Wei
0fe5722744
[Core] Add cached memory to unused memory in Linux/BSD (#10084)
* add cached memory to available memory

* format

* bug fixed

* bug fixed

* fixed

* lint
2020-08-13 23:47:52 -07:00
Ian Rodney
a252aa29da
[docker] Wrap more internal items with run_env="host" (#10078) 2020-08-13 20:35:06 -07:00
SangBin Cho
16f6ee4914
Small Update (#10107) 2020-08-13 19:39:55 -07:00
SangBin Cho
55fe7f65a5
[Tests] Make test_output debugging easy (#10091)
* Fix.

* Fix.
2020-08-13 18:45:26 -07:00
Eric Liang
c9f13b0833
[Placement Groups] Support CUDA_VISIBLE_DEVICES (#10053) 2020-08-13 18:00:04 -07:00
Simon Mo
01f38bc5d1
CoreWorker correctly push metrics to agent (#10031) 2020-08-13 16:44:53 -07:00
Ícaro Aragão
b77d6bf87d
[GCS] Improve fallback for getting local valid IP for GCS server (#10004) 2020-08-13 16:29:47 -05:00