Sven Mika
|
c9d220bcda
|
[RLlib] Upgrade RLlib regression test scripts to new testing tool - RLlib release logs for 1.4. (#16080)
|
2021-06-01 17:39:18 +02:00 |
|
Chris Bamford
|
1e3721ef4a
|
[RLlib] Remove bad spinlocks to allow pytorch GPU scheduler to interrupt. (#16162)
|
2021-06-01 16:40:28 +02:00 |
|
Alex Wu
|
de0f856b68
|
[namespaces] Isolation for named placement groups (#16000)
|
2021-06-01 05:50:19 -07:00 |
|
SangBin Cho
|
bfa8ebcae9
|
[Test] Fix flaky global gc test (#16154)
* fast global gc to fix flaky test
* lint
|
2021-06-01 00:17:03 -07:00 |
|
Chris K. W
|
31364ed9b4
|
[autoscaler] Autoscaler metrics (#16066)
Co-authored-by: Ian <ian.rodney@gmail.com>
|
2021-05-31 22:27:45 -07:00 |
|
Amog Kamsetty
|
da6f28d777
|
[Release] Add multi-node, multi-GPU SGD release test (#16046)
|
2021-05-31 16:23:04 -07:00 |
|
SangBin Cho
|
9fa3b9f6f3
|
[Nightly test] Test non streaming shuffle (#16150)
|
2021-05-31 15:28:02 -07:00 |
|
qicosmos
|
45d2331d5a
|
[C++ Worker] Remove ray core dependency completely (#16108)
|
2021-05-31 15:39:18 +08:00 |
|
Chong-Li
|
d5d0072635
|
Refactor RayletBasedActorScheduler (#16018)
|
2021-05-31 15:28:00 +08:00 |
|
SongGuyang
|
17b5f4dcaa
|
[C++ worker] support config from RayConfig and command line(gflag) (#16086)
|
2021-05-31 11:56:02 +08:00 |
|
zhuangzhuang131419
|
0429882bbf
|
[autoscaler] Implement node provider for aliyun (#15712)
Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: zhuang <zhengchicheng.zcc@alibaba-inc.com>
Co-authored-by: chenk008 <kongchen28@gmail.com>
Co-authored-by: wuhua.ck <wuhua.ck@alibaba-inc.com>
|
2021-05-29 00:56:32 -07:00 |
|
Amog Kamsetty
|
38b657cb65
|
[Tune] Place remote tune.run on node running the client server (#16034)
* force placement on persistent node
* address comments
* doc
|
2021-05-28 18:32:57 -07:00 |
|
Amog Kamsetty
|
cfa2997b86
|
[XGBoost] Add test with Ray Client (#16103)
|
2021-05-28 16:13:06 -07:00 |
|
Sven Mika
|
5fe34862ce
|
[RLlib] DDPG torch GPU bug. (#16133)
|
2021-05-28 22:09:25 +02:00 |
|
Ian Rodney
|
5ca1b297e4
|
[RayClient][Proxy] BugFixes (#16040)
|
2021-05-28 10:24:48 -07:00 |
|
Ian Rodney
|
ec46794767
|
[Client] Add ray.client().disconnect() (#16021)
|
2021-05-28 10:15:44 -07:00 |
|
Lixin Wei
|
3d37e3a315
|
[Refactor] Replace FractionalResourceQuantity with FixedPoint (#16052)
* refactor
* fix
* fix compilation
* fix
* fix cross-platform compilation
* lint
* fix test
* Revert "fix test"
This reverts commit 0ff23b125ce4159b91cc170dbc17b5ed70c9ab11.
* change rounding to truncating
* Update BUILD.bazel
Co-authored-by: Eric Liang <ekhliang@gmail.com>
|
2021-05-28 09:32:51 -07:00 |
|
Sven Mika
|
33a69135cb
|
[RLlib] Issue 16117: DQN/APEX torch not working on GPU. (#16118)
|
2021-05-28 09:12:53 +02:00 |
|
Eric Liang
|
7f2e16fe8f
|
Make the spill test on unstable filesystem not so verbose (#16119)
* less logs
* update
* update
|
2021-05-27 20:32:48 -07:00 |
|
architkulkarni
|
cc7cc4fb9f
|
[Core] Allow specifying runtime_env conda and pip via filepath (#16073)
|
2021-05-27 17:58:47 -05:00 |
|
Clark Zinzow
|
a8ac383760
|
Decrease the number of nodes and actors started on each node in test_actor_multiple_gpus_from_multiple_tasks. (#16124)
|
2021-05-27 15:58:20 -07:00 |
|
Clark Zinzow
|
cd71d5e8ac
|
[Test] Ignore psutil.AccessDenied when gathering per-process memory info upon an OOM. (#16123)
|
2021-05-27 15:40:44 -07:00 |
|
Eric Liang
|
9c73591a4e
|
Revert "Fix tracing bug when actors are defined before connecting to … (#16120)
This reverts commit 6c1ea66611.
|
2021-05-27 11:50:36 -07:00 |
|
Amog Kamsetty
|
5d3cb295bd
|
[Tune] Add find_free_port Tune util (#16098)
|
2021-05-27 11:27:28 -07:00 |
|
Edward Oakes
|
90a76ad558
|
[Serve] use placement group by default (#16113)
|
2021-05-27 11:03:29 -07:00 |
|
SangBin Cho
|
d0dc9abdfc
|
[Plasma store] Improve the OOM logging message. (#16051)
|
2021-05-27 10:09:58 -07:00 |
|
Yi Cheng
|
5d0b302121
|
[core] Trigger global gc when plasma store is under pressure. (#15775)
|
2021-05-27 10:07:59 -07:00 |
|
Tao Wang
|
881e4913f1
|
Don't broadcast empty resources data (#16104)
|
2021-05-27 10:06:32 -07:00 |
|
Kathryn Zhou
|
6c1ea66611
|
Fix tracing bug when actors are defined before connecting to cluster (#16069)
|
2021-05-27 09:28:11 -07:00 |
|
architkulkarni
|
65eab8f376
|
Revert "Revert "[Core] Add "env_vars" field to runtime_env"" (#16107)
|
2021-05-27 10:16:33 -05:00 |
|
SangBin Cho
|
94dc06d852
|
[Nightly test] improve error detection (#16102)
* improve error detection
* improve gitignore
* fix
|
2021-05-27 00:33:21 -07:00 |
|
DK.Pino
|
ea0ee86063
|
[Placement Group] Fix actor scheduling with Placement Group bug. (#16006)
|
2021-05-26 22:16:38 -07:00 |
|
Ian Rodney
|
69d0e8e4fe
|
[Docs][ClientBuilder] Add ray.client() and ray.ClientBuilder to Experimental API docs (#16058)
|
2021-05-26 21:05:47 -07:00 |
|
SongGuyang
|
a4c108e5f6
|
[C++ worker] delete useless test (#16082)
|
2021-05-27 11:23:59 +08:00 |
|
architkulkarni
|
7cfe7f840c
|
Revert "[Core] Add "env_vars" field to runtime_env (#16075)" (#16099)
This reverts commit 1e245005c9.
|
2021-05-26 16:27:04 -07:00 |
|
Eric Liang
|
2f4628fdb4
|
Fix CHECK_FAIL when scheduling task with duplicate object requests (#16063)
|
2021-05-26 15:13:16 -07:00 |
|
Stephanie Wang
|
55bb1e93b4
|
[core] Wait for objects to be sealed before throwing OutOfMemory (#15955)
* Wait for objects to seal
* x
* comments
* error code
|
2021-05-26 14:18:32 -07:00 |
|
Eric Liang
|
3d1ba4a70e
|
Add feature flag for plasma overcommit (#16061)
|
2021-05-26 10:53:57 -07:00 |
|
architkulkarni
|
1e245005c9
|
[Core] Add "env_vars" field to runtime_env (#16075)
|
2021-05-26 12:11:19 -05:00 |
|
qicosmos
|
bbb61d0c00
|
[C++ Worker] remove core.h in api (#16079)
* remove core.h in api
* remove unused code and header
* remove core.h and some dependencies
* fix
|
2021-05-26 20:52:21 +08:00 |
|
qicosmos
|
498da13944
|
[C++ worker] Improve cpp worker (#15907)
|
2021-05-26 16:45:56 +08:00 |
|
qicosmos
|
d8f58e683f
|
[C++ worker] Add c++ worker log (#16015)
|
2021-05-26 16:13:02 +08:00 |
|
Kai Yang
|
853d650e29
|
Revert "Revert "[Object spilling] Avoid worker crash when an object is spille… (#15964)" (#16012)
This reverts commit 29aa336a4d.
|
2021-05-25 23:48:24 -07:00 |
|
Ian Rodney
|
3dbdd4eb46
|
[Client][Proxy] Track Num Clients in the proxy (#16038)
|
2021-05-25 22:17:43 -07:00 |
|
SongGuyang
|
7c3874b38e
|
remove id.h dependence for c++ worker headers (#16055)
|
2021-05-26 11:56:24 +08:00 |
|
Richard Liaw
|
08de5a36e1
|
[Horovod] Test with Ray Client (#15996)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
|
2021-05-25 20:21:58 -07:00 |
|
Simon Mo
|
2aee4ac40d
|
[Serve] Cleanup examples and tests (#16042)
|
2021-05-25 15:32:36 -07:00 |
|
Xiang Xu
|
ec8b591f32
|
[docs] typo fix on the Doc for helm (#16036)
|
2021-05-25 12:59:39 -07:00 |
|
Sven Mika
|
e61922c4ac
|
[RLlib] Add one-liner to docs on internship/RL-engineer position. (#16050)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2021-05-25 12:58:54 -07:00 |
|
Ian Rodney
|
113fd6e765
|
[Client][Proxy] Refactor RayClient Proxy to not use additional Threads. (#16057)
|
2021-05-25 10:07:19 -07:00 |
|