6497 commits

Author SHA1 Message Date
SangBin Cho
f6f9b15299
. (#11998) 2020-11-12 21:33:00 -08:00
Ian Rodney
3b56a1a522
[docker] auto-populate shared memory size (#11953) 2020-11-12 17:22:42 -08:00
Michael Luo
59bc1e6c09
[RLLib] MAML extension for all models except RNNs (#11337) 2020-11-12 16:51:40 -08:00
Barak Michener
272edcca94
[ray_client]: Implement function calls (#11922) 2020-11-12 16:49:34 -08:00
Eric Liang
a6a8e777f3
[autoscaler] Interpret autoscaling_speed as 1/x-1 of previous target util fraction (#11961)
* tweak

* update
2020-11-12 16:23:50 -08:00
Sven Mika
0bd69edd71
[RLlib] Trajectory view API: enable by default for ES and ARS (#11826) 2020-11-12 10:33:10 -08:00
Michael Luo
6e6c680f14
MBMPO Cartpole (#11832)
* MBMPO Cartpole Done

* Added doc
2020-11-12 10:30:41 -08:00
Ian Rodney
9254de0b02
[autoscaler] Fix custom node resources on head (#11896) 2020-11-12 10:30:04 -08:00
Gekho457
ad639f12d8
[autoscaler/k8s] Preliminary k8s operator (#11929) 2020-11-12 11:58:02 -06:00
Gabriele Oliaro
4744ed01f7
Queueing non-actor tasks at the workers (#11051)
* separated adding tasks to queue and executing them (worker side)

* linting

* first review

* second rev

* rev3, all tests passing locally

* linting

* rev4

* linting

* finished rev4, all tests passing locally (mac)

* rev4, all tests passing locally

* linting

* rev5

* bug fix

* hopefully fixed build

* nvm

* ptr cast

* linting

* no special treatment for actor creation tasks
2020-11-12 12:44:13 -05:00
Kai Fricke
02c02369ca
[tune] Fix hpo randint limits (#11946)
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
2020-11-12 08:45:49 -08:00
Kristian Hartikainen
07f401d99d
[tune] Fix unflatten dict (#11948) 2020-11-12 08:43:15 -08:00
Lee moon soo
9920933e31
[docker] Support non-root container (#11407) 2020-11-12 08:41:50 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Michael Luo
59ccbc0fc7
[RLlib] Model Annotations: Tensorflow (#11964) 2020-11-12 12:18:50 +01:00
Michael Luo
b2984d1c34
[RLlib] Model Annotations to Torch Models (#9749) 2020-11-12 12:16:12 +01:00
Tao Wang
3fbd8be851
[Placement Group]Do not really subtract resources, just count (#11894)
* [Placement Group]Do not really subtract resources, just count

* add todo
2020-11-12 00:01:19 -08:00
SangBin Cho
f80d812799
[Object Spilling] Introduce SpillWorker & RestoreWorker Pool to avoid IO worker deadlock. (#11885) 2020-11-11 18:20:14 -08:00
Barak Michener
de6df51bd2
[redis, docs]: Bump redis and docs/Pillow dependencies (#11371) 2020-11-11 18:15:27 -08:00
Max Fitton
f545418c3f
[Dashboard] Fix dashboard regression caused by logCount and errCount being removed from worker payload (#11954) 2020-11-11 14:55:54 -08:00
Edward Oakes
73a1cb702b
Split _get_node_provider_cls off from _get_node_provider (#11949) 2020-11-11 16:10:46 -06:00
Ameer Haj Ali
85197deece
[autoscaler] Remove legacy autoscaler (#11802) 2020-11-11 13:36:48 -08:00
Sven Mika
72fc79740c
[RLlib] Issue with pickle versions (breaks rollout test cases in RLlib). (#11939) 2020-11-11 21:52:21 +01:00
dHannasch
396ae0b7c2
Add docstring for find_redis_address (#11884) 2020-11-11 12:24:36 -06:00
Sven Mika
291c172d83
[RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909) 2020-11-11 18:45:28 +01:00
Kai Yang
4735c032ed
[Core] Fix C++ worker test (#11941) 2020-11-11 09:04:45 -08:00
Tao Wang
92286660e4
[Core] Lazy create node manager clients, and destroy then (#11928) 2020-11-11 08:51:40 -08:00
SangBin Cho
7b8bd15702
[Stalebot] Fix issues. (#11930) 2020-11-11 00:28:02 -08:00
Siyuan (Ryans) Zhuang
b8dda0e3d0
[Serialization] Fix buffer alignment issues (#11888)
* fix buffer alignment issues

* remove unused fields

* aligned memory allocation

* windows compat

* license. fix compiler warnings

* fix compilation error

* reinterpret_cast
2020-11-10 23:44:16 -08:00
chaokunyang
1979ea9c0a
fix disable javadoc lint (#11907) 2020-11-11 13:40:50 +08:00
dHannasch
29cb32539e
[Core] If failed to connect to redis, try to say why. (#11916) 2020-11-10 18:22:10 -08:00
fangfengbin
433e4f32da
[GCS]Reduce get operations of worker table (#11599)
* [GCS]Reduce get operations of worker table

* fix ut bug

* fix ut bug

* fix review comment

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-11-10 18:11:25 -08:00
Alex Wu
8afd2acdc1
[Autoscaler] simulator placement groups (#11777) 2020-11-10 18:10:36 -08:00
Eric Liang
46f3652102
Remove repeat push timeout from object manager (#11874) 2020-11-10 16:26:53 -08:00
Keqiu Hu
0c1bdaef59
[tune] TensorFlow Distributed Trainable (#11876)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-10 14:59:08 -08:00
Richard Liaw
50dbf1a307
[core] Support configurable number of "check for redis" attempts (#11902)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-10 14:57:57 -08:00
Ian Rodney
1d158dda32
[serve] Rename to use replicas, not workers (#11822) 2020-11-10 11:36:15 -08:00
Eric Liang
9b8218aabd
[docs] Move all /latest links to /master (#11897)
* use master link

* remae

* revert non-ray

* more

* mre
2020-11-10 10:53:28 -08:00
fangfengbin
543f7809a6
[GCS]Add gcs dump log(Part1) (#11727)
* add part code

* fix compile bug

* Fix bug

* Add part code

* fix review comment

* fix review comment

* fix lint error

* fix review comment

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2020-11-10 14:10:03 +08:00
Nikita Vemuri
aba9288615
[Autoscaler] Introduce callback system (#11674)
Co-authored-by: Nikita Vemuri <nikitavemuri@Nikitas-MacBook-Pro.local>
Co-authored-by: Xiayue Charles Lin <xcl@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-09 20:03:15 -08:00
Eric Liang
ee2da0cf45
[Core] PushManager for reliable broadcast (#11869) 2020-11-09 18:01:47 -08:00
Benjamin Black
1999266bba
Updated pettingzoo env to acomidate api changes and fixes (#11873)
* Updated pettingzoo env to acomidate api changes and fixes

* fixed test failure

* fixed linting issue

* fixed test failure
2020-11-09 16:09:49 -08:00
Eric Liang
a9cf0141a0
[autoscaler] Fix semantics of request_resources (#11820) 2020-11-09 14:57:40 -08:00
Edward Oakes
1c132f2ff8
[serve] Improve DEBUG logging for understanding perf (#11838) 2020-11-09 14:10:42 -06:00
architkulkarni
adcaabcd64
[Serve] Reconfigure backend class at runtime (#11709) 2020-11-09 14:04:51 -06:00
Kai Fricke
287aba6dc3
[tune] schedulers: Add test for context finalization (#11889) 2020-11-09 11:37:05 -08:00
Richard Liaw
a09e49ee94
[core] Add retry for reading session name (#11844) 2020-11-09 11:22:50 -08:00
Kai Fricke
88be1ea20b
[tune] Handle infinite and NaN values (#11835) 2020-11-09 11:18:31 -08:00
Kai Yang
904f48ebd9
[Core] Multi-tenancy: Pass job ID from Raylet to worker via env variable (#11829)
* Pass job ID from Raylet to worker via env variable

* fix

* fix

* fix

* lint

* fix

* fix test_object_spilling

* address comments

* lint

* fix
2020-11-09 11:02:15 -08:00
Tao Wang
77e3163630
[GCS]Only pass node id to node failure detector (#11886)
* [GCS]Only pass node id to node failure detector

* rename
2020-11-09 10:52:33 -08:00