Commit graph

9353 commits

Author SHA1 Message Date
Guyang Song
b97027ec64
[C++ API] support cpu gpu num 0 (#17783)
* support cpu gpu num 0

* support cpu gpu num 0

* fix
2021-08-13 08:45:33 +08:00
Robert Nishihara
f624ddae5f
Remove outdated link from readme (#17788) 2021-08-12 17:11:59 -07:00
Yi Cheng
e32d33f39c
Fix ray.init hanging due to failure. (#17732)
* up

* change to 30s

* up

* up

* format
2021-08-12 16:56:10 -07:00
wanxing
e4c8125c86
Make some function private (#17729)
* ReceiveObjectChunk

* more
2021-08-12 15:27:37 -07:00
Eric Liang
7fc62a1529
Support dataset union (#17793) 2021-08-12 14:01:40 -07:00
Lixin Wei
d287fc941b
[Core] Add Running Count to instrumented_io_context (#17664) 2021-08-12 13:56:40 -07:00
Chen Shen
9565fa549e
[Core][RFC] limit the total number of inlined bytes in task request rpc
Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>
2021-08-12 13:55:54 -07:00
Simon Mo
ec8409ff06
Add @architkulkarni to snapshot code owner (#17785) 2021-08-12 10:58:02 -07:00
Simon Mo
6879293b6b
[CI] Mark some tests exclusive (#17650) 2021-08-12 10:28:03 -07:00
Guyang Song
88b8de5904
[C++ API] support ray::IsInitialized (#17780)
* support ray::IsInitialized

* address comments

* fix
2021-08-13 00:51:26 +08:00
SangBin Cho
8fd7e025be
Skip raylet kill windows #17682 (#17683)
* Try fixing it?

* Done

* skip raylet signal
2021-08-12 09:35:44 -07:00
matthewdeng
55680a1f9e
[SGD] v2 initial checkpoint functionality (#17632)
* [SGD] initial checkpoint functionality

* remove thread implementation and merge with fetch_next_result

* Update comment

Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>

* address comments

* add additional tests

* fix imports

* load most recently saved checkpoint

Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2021-08-12 08:52:04 -07:00
Clark Zinzow
d6eeb5dc70
[Datasets] Add local and S3 filesystem test coverage for file-based datasources. (#17158) 2021-08-12 08:39:31 -07:00
Guyang Song
e53aeca6bb
[C++ API] support set resources in RayConfig (#17779) 2021-08-12 22:53:42 +08:00
Guyang Song
5713a0be6c
[C++ API] add C++ API docs (#17743) 2021-08-12 22:40:09 +08:00
mguarin0
3e010c5760
[rllib] bug fix for rllib pettingzoo pistonball_v4 example (#17701)
* bug fix for rllib pettingzoo pistonball_v4 example

* adding test for PR 17701

* ran scripts/format.sh

* ok

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-08-12 00:25:00 -07:00
architkulkarni
00f6b30684
[Serve] [Dashboard] Support nondetached and multiple Serve instances in cluster snapshot (#17747) 2021-08-11 22:26:54 -05:00
Eric Liang
ce171f10a1
Remove legacy plasma unlimited and pull manager pinning flag (#17753) 2021-08-11 20:19:12 -07:00
Guyang Song
63f9ba2858
[C++ API][Fix] support ray::Init without RayConfig (#17733) 2021-08-12 10:59:21 +08:00
Kai Yang
ab53c5fc93
[Java] Update rolling logging configuration (#17741) 2021-08-12 10:15:27 +08:00
Qing Wang
6d6a1ea43e
Support reading system configs from native in Java. (#17703)
* Support reading system configs from native in Java.

* Fix lint

* Lint cpp

* Fix Java cases.

* Address comments.

* Address comments.
2021-08-12 10:06:01 +08:00
Clark Zinzow
623db7c47b
[Datasets] Add support for reading partitioned Parquet datasets. (#17716) 2021-08-11 15:55:49 -07:00
Jiao
3c64a1a3c1
Add micro benchmark to releaser repo (#17727) 2021-08-11 15:15:33 -07:00
architkulkarni
9a70e83e90
[hotfix] pin tensorflow==2.5.1 (#17760)
* pin tensorflow==1.5.1

* Update python/requirements.txt

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-08-11 15:15:22 -07:00
Yi Cheng
aa96e59faf
[workflow] Examples of function chaining (#17715) 2021-08-11 13:15:51 -07:00
Eric Liang
71b3183038
Add implicit init note to Ray docs & dataset version note (#17751) 2021-08-11 13:13:22 -07:00
Yi Cheng
02e79f3fe5
Revert "[Observability] Export useful metrics (#17578)" (#17752)
This reverts commit bd4db53df2.
2021-08-11 12:21:50 -07:00
Jiao
e38db5875b
Add serve external kv store (#17622) 2021-08-11 12:06:14 -07:00
Amog Kamsetty
ed24bae644
[SGD] Fail if num_workers is not greater than 0 (#17723) 2021-08-11 10:05:19 -07:00
Ian Rodney
97f7ae5e06
[Cluster Launcher] Allow attach/exec on uninitialized head node (#17688) 2021-08-11 09:43:23 -07:00
Sven Mika
7f2b3c0824
[RLlib] Issue 17667: CQL-torch + GPU not working (due to simple_optimizer=False; must use simple optimizer!). (#17742) 2021-08-11 18:30:21 +02:00
chenk008
f0fc26960d
[sgd] Wait for placement_group deletion when shutdown worker_group (#17698)
* fix

* fix ut

* delete sleep

* fix according to comment

* fix according to comment

* use pg in test_resize

* fix
2021-08-11 08:47:49 -07:00
Tricia Fu
24c4220bd7
[doc][serve] Update http-servehandle.rst (#17680) 2021-08-11 10:39:58 -05:00
Julius Frost
6891dee6ea
[RLlib] Better exceptions with traceback in TorchPolicy (#17690) 2021-08-11 15:01:07 +02:00
Sven Mika
811d71b368
[RLlib] Issue 17653: Torch multi-GPU (>1) broken for LSTMs. (#17657) 2021-08-11 12:44:35 +02:00
Sven Mika
29f20cccb6
[RLlib] Issue 17706: AttributeError: 'numpy.ndarray' object has no attribute 'items'" on certain turn-based MultiAgentEnvs with Dict obs space. (#17735) 2021-08-11 12:33:35 +02:00
SongGuyang
4176e43ef2
Remove binary printing from RAY_CHECK log (#17728) 2021-08-11 18:32:12 +08:00
J K Terry
48e32555c8
[rllib] Update PettingZoo dependency versions (#17702)
* update pettingzoo dependency versions

* pettingzoo version

* fix tests
2021-08-11 01:19:19 -07:00
Shantanu
abc593561c
[client] fix ClientRemoteMethod error message (#17726)
Co-authored-by: hauntsaninja <>
2021-08-11 00:43:17 -07:00
Julius Frost
9322f6aab5
[rllib] Fix classes decorated with @Deprecated to be classes instead of methods (#17666)
* fix deprecated classes from being methods

* format
2021-08-10 18:25:31 -07:00
Yi Cheng
bd4db53df2
[Observability] Export useful metrics (#17578)
* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* checkpoint

* up

* up

* up

* up

* fix

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* add comments

* up

* up

* up

* up

* add tests
2021-08-10 17:14:42 -07:00
architkulkarni
0c2c99b951
[Dashboard] [Serve] Make serve import conditional (#17713) 2021-08-10 17:06:00 -07:00
SongGuyang
63c15d7ced
[core] make 'PopWorker' to be an async function (#17202)
* make 'PopWorker' to be an async function

* pop worker async works

* fix

* address comments

* bugfix

* fix cluster_task_manager_test

* fix

* bugfix of detached actor

* address comments

* fix

* address comments

* fix aioredis

* Revert "fix aioredis"

This reverts commit 041b983eac95b105ab0e853e84c4cf2647008431.

* bug fix

* fix

* fix test_step_resources test

* format

* add unit test

* fix

* add test case PopWorkerStatus

* address commit

* fix lint

* address comments

* add python test

* address comments

* make an independent function

* Update test_basic_3.py

Co-authored-by: Hao Chen <chenh1024@gmail.com>
2021-08-10 17:03:17 -07:00
SangBin Cho
a3c5cce834
Add prepare for dask on ray 1tb sort. (#17708) 2021-08-10 16:26:05 -07:00
xwjiang2010
932f038644
[tune] Type hint TrialExecutor. Use Abstract Base Class. (#17584) 2021-08-10 14:17:22 -07:00
Clark Zinzow
78d23434e6
[Datasets] Fix write_json so roundtrip writing + reading works. (#17691)
* Write out dataset blocks as newline-delimited JSON.

* Add roundtrip JSON reading + writing test.

* Formatting.
2021-08-10 13:24:33 -07:00
SangBin Cho
705a7192b3
Unflake multi node 3 (#17694) 2021-08-10 13:16:52 -07:00
architkulkarni
febe54f422
[serve] [dashboard] Change empty serve cluster snapshot from empty list to empty dict (#17655) 2021-08-10 13:35:00 -05:00
Amog Kamsetty
0b8489dcc6
Revert "[RLlib] Add support for multi-GPU to DDPG. (#17586)" (#17707)
This reverts commit 0eb0e0ff58.
2021-08-10 10:50:21 -07:00
Amog Kamsetty
77f28f1c30
Revert "[RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). (#17566)" (#17709)
This reverts commit 3b447265d8.
2021-08-10 10:50:01 -07:00