Commit graph

7107 commits

Author SHA1 Message Date
Amog Kamsetty
20acc3b05e
Revert "Inline small objects in GetObjectStatus response. (#13309)" (#13615)
This reverts commit a82fa80f7b.
2021-01-21 16:10:34 -08:00
Dmitri Gekhtman
87ca102c93
[Kubernetes] Unit test for cluster launch and teardown using K8s Operator (#13437) 2021-01-21 12:00:37 -06:00
Ian Rodney
68038741ac
[serve] Refactor BackendState to use ReplicaState classes (#13406) 2021-01-21 11:16:02 -06:00
Clark Zinzow
a82fa80f7b
Inline small objects in GetObjectStatus response. (#13309) 2021-01-21 09:15:18 -08:00
Kai Yang
92f1e0902e
[Java] Fix return of java doc (#13601) 2021-01-21 23:57:20 +08:00
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) 2021-01-21 16:43:55 +01:00
Saeid
d11e62f9e6
[RLlib] Fix problem in preprocessing nested MultiDiscrete (#13308) 2021-01-21 16:36:11 +01:00
Sven Mika
daf0bef285
[RLlib] Dreamer: Fix broken import and add compilation test case. (#13553) 2021-01-21 16:30:26 +01:00
Alex Wu
b9ac3878ae
[Autoscaler] Display node status tag in autoscaler status (#13561)
* .

* .

* .

* .

* .

* lint

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-01-20 19:20:54 -08:00
Siyuan (Ryans) Zhuang
a09997dc9e
[Core] Remove 'PlasmaBuffer' in the buffer header (#13188) 2021-01-20 12:01:44 -08:00
Edward Oakes
b796de4104
[metrics] Check that all tag_keys are set when recording (#13420) 2021-01-20 13:09:44 -06:00
dmatch01
fd6882176a
Fix for operator role definition to add raycluster/finalizer (#13567) 2021-01-20 13:02:02 -06:00
Kai Fricke
8804758409
[xgboost] Add XGBoost release tests (#13456)
* Add XGBoost release tests

* Add more xgboost release tests

* Use failure state manager

* Add release test documentation

* Fix wording

* Automate fault tolerance tests
2021-01-20 18:40:23 +01:00
Eric Liang
e6412efdf5
Extra fix ray client newline (#13577) 2021-01-20 09:23:14 -08:00
ZhuSenlin
2e7c2b774f
[Core] add thread name to help performance profiling (#13506) 2021-01-20 20:34:28 +08:00
Kai Fricke
6c23bef2a7
[tune] Allow actor reuse for new trials (#13549)
* Allow actor reuse for new trials

* Fix tests and update conf when starting new trial

* Move magic config to `reset_trial`
2021-01-20 11:25:33 +01:00
Daan Klijn
800304acfb
[tune] wandb - WandbLogger now also accepts wandb.data_types.Video (#13169) 2021-01-20 01:19:54 -08:00
Eric Liang
d0f224d5cf
Revert "Pipe monitor.err logs to driver" (#13574)
This reverts commit a0d08c2cc6.
2021-01-20 00:29:19 -08:00
Tao Wang
b2a6e55289
[GCS]Only publish fields used by sub clients in WorkerTableData (#13508) 2021-01-20 16:14:59 +08:00
Keqiu Hu
6c9088eb62
[core] refactor disconnect message processing and enrich WorkExitType (#13527)
* [core] refactor disconnect message processing and enrich WorkExitType

add changes from refactor pr

fix type typo

fix typo

fix

* address comments

* also update WorkerTableData

* fix tests
2021-01-19 22:09:46 -08:00
SangBin Cho
e544c008df
Fix restoration request dedup issues. (#13546) 2021-01-19 15:28:54 -08:00
Stephanie Wang
bfe147a6a8
Debug info to GCS pub sub (#13564) 2021-01-19 14:55:23 -08:00
Eric Liang
a0d08c2cc6
Pipe monitor.err logs to driver 2021-01-19 12:27:07 -08:00
Simon Mo
c963cbc038
Fix Docker Permission for Serve release test again (#13543) 2021-01-19 12:23:30 -08:00
Dmitri Gekhtman
7b4a97c610
Make AWSNodeProvider.create_node return nodes created (#13498)
* Make AWSNodeProvider.create_node return node config

* return-dict

* Node provider interface create node return type Any

* Type clarification.

* Delete debug code

* Oops reset example-full changes

* Return type specified. GCP create node returns None.

* Article
2021-01-19 12:17:46 -08:00
Amog Kamsetty
20016c983f
[Tune] MLflow Credentials (#13533) 2021-01-19 11:55:13 -08:00
Edward Oakes
9b071eb449
[metrics] Better validation for tags (#13421) 2021-01-19 13:26:51 -06:00
SangBin Cho
99375c4cfc
[Object Spilling] Remove retries and use a timer instead. (#13175) 2021-01-19 11:01:45 -08:00
fyrestone
86d5000047
Fix passing env on windows (#13253) 2021-01-19 10:04:38 -06:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 2021-01-19 14:22:36 +01:00
Sven Mika
e74947cc94
[RLlib] Env directory cleanup and tests. (#13082) 2021-01-19 10:09:39 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. (#13397) 2021-01-19 09:51:35 +01:00
Sven Mika
a65ee92b69
[RLlib] MARWIL loss function test case and cleanup. (#13455) 2021-01-19 09:51:05 +01:00
Todd A. Anderson
2506a6cd0e
Remove PYTHON_MODE that is not defined in Ray so that import * will work from other packages. (#13544) 2021-01-18 23:07:01 -08:00
SameerF
701038e410
Fix typo (#13098) 2021-01-18 19:28:10 -08:00
Richard Liaw
7a2997ea8c
[tune] support experiment checkpointing for grid search (#13357) 2021-01-18 19:24:36 -08:00
Ameer Haj Ali
1fbc3ddfac
Add ability to not start Monitor when calling ray start (#13505) 2021-01-18 18:31:53 -08:00
Simon Mo
fb16dd5265
Add Dashboard Python Test to Buildkite (#13530) 2021-01-18 17:20:45 -08:00
Simon Mo
6341f1fa2e
[Serve] Allow ObjectRef for Composition (#12592) 2021-01-18 15:26:35 -08:00
Kai Fricke
dc42abb2f5
[tune] placement group support (#13370) 2021-01-18 11:58:57 -08:00
Sven Mika
1f00f834ac
[RLlib] Solve PyTorch/TF-eager A3C async race condition between calling model and its value function. (#13467) 2021-01-18 10:29:03 -08:00
Tao Wang
516eb77080
[GCS] Remove task info publish as nowhere uses it (#13509)
* Remove task info publish as nowhere uses it

* simplify right publish channel
2021-01-18 01:15:03 -08:00
Simon Mo
1e2adb335e
[CI] Buildkite PR Environment for Simple Tests (#13130) 2021-01-18 00:44:24 -08:00
Tao Wang
3a0710130c
[GCS]Only publish changed field when node dead (#13364)
* Only update changed field when node dead

* node_id missed
2021-01-17 21:28:35 -08:00
ZhuSenlin
a4ebdbd7da
Refactor node manager to eliminate new_scheduler_enabled_ (#12936) 2021-01-18 00:15:35 +08:00
ZhuSenlin
2cd51ce608
sync write internal config in gcs (#13197) 2021-01-17 12:00:01 +08:00
Eric Liang
8c8af2616e
Minimal version of piping autoscaler events to driver logs (#13434) 2021-01-16 10:06:20 -08:00
Dmitri Gekhtman
7e54911093
move message to debug (#13472) 2021-01-16 10:04:41 -08:00
Richard Liaw
86387504ee
[tune] fix small docs typo (#13355)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-16 00:49:17 -08:00
Amog Kamsetty
1d3941e41a
[Tests] Skip failing windows tests (#13495)
* skip failing windows tests

* skip more

* remove

* updates
2021-01-15 20:51:33 -08:00