Sven Mika
e74947cc94
[RLlib] Env directory cleanup and tests. ( #13082 )
2021-01-19 10:09:39 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. ( #13397 )
2021-01-19 09:51:35 +01:00
Sven Mika
a65ee92b69
[RLlib] MARWIL loss function test case and cleanup. ( #13455 )
2021-01-19 09:51:05 +01:00
Todd A. Anderson
2506a6cd0e
Remove PYTHON_MODE that is not defined in Ray so that import * will work from other packages. ( #13544 )
2021-01-18 23:07:01 -08:00
SameerF
701038e410
Fix typo ( #13098 )
2021-01-18 19:28:10 -08:00
Richard Liaw
7a2997ea8c
[tune] support experiment checkpointing for grid search ( #13357 )
2021-01-18 19:24:36 -08:00
Ameer Haj Ali
1fbc3ddfac
Add ability to not start Monitor when calling ray start ( #13505 )
2021-01-18 18:31:53 -08:00
Simon Mo
fb16dd5265
Add Dashboard Python Test to Buildkite ( #13530 )
2021-01-18 17:20:45 -08:00
Simon Mo
6341f1fa2e
[Serve] Allow ObjectRef for Composition ( #12592 )
2021-01-18 15:26:35 -08:00
Kai Fricke
dc42abb2f5
[tune] placement group support ( #13370 )
2021-01-18 11:58:57 -08:00
Sven Mika
1f00f834ac
[RLlib] Solve PyTorch/TF-eager A3C async race condition between calling model and its value function. ( #13467 )
2021-01-18 10:29:03 -08:00
Tao Wang
516eb77080
[GCS] Remove task info publish as nowhere uses it ( #13509 )
...
* Remove task info publish as nowhere uses it
* simplify right publish channel
2021-01-18 01:15:03 -08:00
Simon Mo
1e2adb335e
[CI] Buildkite PR Environment for Simple Tests ( #13130 )
2021-01-18 00:44:24 -08:00
Tao Wang
3a0710130c
[GCS]Only publish changed field when node dead ( #13364 )
...
* Only update changed field when node dead
* node_id missed
2021-01-17 21:28:35 -08:00
ZhuSenlin
a4ebdbd7da
Refactor node manager to eliminate new_scheduler_enabled_ ( #12936 )
2021-01-18 00:15:35 +08:00
ZhuSenlin
2cd51ce608
sync write internal config in gcs ( #13197 )
2021-01-17 12:00:01 +08:00
Eric Liang
8c8af2616e
Minimal version of piping autoscaler events to driver logs ( #13434 )
2021-01-16 10:06:20 -08:00
Dmitri Gekhtman
7e54911093
move message to debug ( #13472 )
2021-01-16 10:04:41 -08:00
Richard Liaw
86387504ee
[tune] fix small docs typo ( #13355 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-16 00:49:17 -08:00
Amog Kamsetty
1d3941e41a
[Tests] Skip failing windows tests ( #13495 )
...
* skip failing windows tests
* skip more
* remove
* updates
2021-01-15 20:51:33 -08:00
SangBin Cho
1179db1fc2
Remove an unnecessary file ( #13499 )
2021-01-15 18:29:12 -08:00
Eric Liang
ee6332dbb0
Bump dev branch to 2.0 to avoid endless version bump toil ( #13497 )
...
* wip
* fix
* fix
2021-01-15 17:41:17 -08:00
Barak Michener
68e3a0e0e1
[ray_client]: fix wrong reference in server_pickler ( #13474 )
...
Change-Id: Ie3d219541b1875e986e72e3ae73ece145c715acf
2021-01-15 15:49:38 -08:00
SangBin Cho
d09df55b14
Update ID specification doc ( #13356 )
2021-01-15 15:15:51 -08:00
Eric Liang
4aeb0ea550
Return version info from Ray client connect, to allow for discovering version mismatches
2021-01-15 14:27:26 -08:00
Simon Mo
7a0597d03f
[CI] Fix Windows Bazel Upload ( #13436 )
2021-01-15 13:27:11 -08:00
Ian Rodney
0ec9ddabc1
[docker/dashboard] Fix ray dashboard ( #12899 )
2021-01-15 10:03:01 -08:00
Simon Mo
dac8b3d58a
[CI] Enable Dashboard tests for master ( #13425 )
2021-01-15 09:43:34 -08:00
SangBin Cho
f6d9996874
[Object Spilling] Dedup restore objects ( #13470 )
...
* done.
* Addressed code review.
2021-01-14 23:51:11 -08:00
fangfengbin
ce1b208e41
[GCS]Remove unused class variable ( #13454 )
2021-01-15 14:48:18 +08:00
Barak Michener
84e110a949
[ray_client]: Support runtime_context as metadata ( #13428 )
2021-01-14 14:37:00 -08:00
Clark Zinzow
9a658b568f
[Core] Ownership-based Object Directory: Consolidate location table and reference table. ( #13220 )
...
* Added owned object reference before Plasma put on Create() + Seal() path.
* Consolidated location table and reference table in reference counter.
* Restore type in definition.
* Clean up owned reference on failed Seal().
* Added RemoveOwnedObject test for reference counter.
* Guard against ref going out of scope before location RPCs.
* Add 'owner must have ref in scope' precondition to documentation for object location methods.
* Move to separate Create() + Seal() methods for existing objects.
* Clearer distinction between Create() and Seal() methods.
* Make it clear that references will normally be cleaned up by reference counting.
2021-01-14 13:48:10 -08:00
Siyuan (Ryans) Zhuang
d1e9887be2
[Serialization] New custom serialization API ( #13291 )
...
* new serialization API with doc & test
* add more notes
* refine notes
* doc
2021-01-14 13:15:31 -08:00
Amog Kamsetty
07e97fe4c2
[xgb] re-enable xgboost_ray tests ( #13416 )
...
* re-enable
* fix
* update xgb_ray version
2021-01-14 22:14:44 +01:00
Edward Oakes
7ba87b8abe
Fix getting runtime context dict in driver ( #13417 )
2021-01-14 14:41:53 -06:00
Ian Rodney
411e37ce3f
[serve] Properly obey SERVE_LOG_DEBUG=0 ( #13460 )
2021-01-14 12:24:22 -08:00
Simon Mo
16e8c4a69f
[Release] Fix Serve release test ( #13303 )
...
The Docker image we were using now uses `ray` users so we have to call
sudo.
2021-01-14 12:23:53 -08:00
Simon Mo
321bbe1ffb
[Dashboard] Fix GPU resource rendering issue ( #13388 )
2021-01-14 12:23:21 -08:00
PENG Zhenghao
e63da54931
[docs] Add more guideline on using ray in slurm cluster ( #12819 )
...
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
Co-authored-by: PENG Zhenghao <pengzh@ie.cuhk.edu.hk>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-14 12:17:53 -08:00
Sven Mika
d98235cc84
[RLlib] Deflake 2x remote & local inference tests (external env). ( #13459 )
2021-01-14 20:44:26 +01:00
Micah Yong
c89ebdd94a
[Core][CLI] ray status and ray memory no longer starts a new job ( #13391 )
...
* Access memory info in ray memory via GlobalStateAccessor rather than calling ray.init()
* Modify ray status cli so that it doesn't start a new job via ray.init()
* Remove local test file
* Make status and error args required in commands.py#debug.status
* Remove unnecessary imports
* Job 38482.1 should now pass
* Resolve merge conflict
2021-01-14 10:12:16 -08:00
Dmitri Gekhtman
2d772a5a6d
[kubernetes][minor] Operator garbage collection fix ( #13392 )
2021-01-14 10:40:15 -06:00
Barak Michener
9c6d892eec
[ray_client]: fix exceptions raised while executing on the server on behalf of the client ( #13424 )
2021-01-14 10:38:01 -06:00
Ameer Haj Ali
2f7ba25efb
[joblib] joblib strikes again but this time on windows ( #13212 )
2021-01-14 10:36:52 -06:00
fangfengbin
4a6c53da46
[Core]Fix raylet scheduling bug ( #13452 )
...
* [Core]Fix raylet scheduling bug
* fix lint error
* fix lint error
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2021-01-14 14:50:32 +01:00
Sven Mika
56878221ed
[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). ( #13363 )
2021-01-14 14:44:33 +01:00
fangfengbin
33b092de28
[GCS]Add gcs resource scheduler ( #13072 )
2021-01-14 20:05:55 +08:00
Kai Fricke
b296642646
Fix linter error ( #13451 )
2021-01-14 10:28:44 +01:00
Amog Kamsetty
560299972c
Revert "Enable Ray client server by default ( #13350 )" ( #13429 )
...
This reverts commit 912d0cbbf9.
2021-01-13 21:28:54 -08:00
fyrestone
8697d67791
Fix raylet::MockWorker::GetProcess crashes ( #13440 )
...
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-01-14 12:19:21 +08:00