9669 commits

Author SHA1 Message Date
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. (#18921) 2021-09-30 15:03:37 +02:00
Avnish Narayan
6dc1a6b72f
[RLlib] Raise error for kl penalty DDPPO (#18959)
* [RLlib] Raise error for kl penalty DDPPO

DDPPO doesn't support KL penalties like PPO-1.
In order to support KL penalties, DDPPO would need to
become centralized, which defeats the purpose of the
algorithm. Users can still tune the entropy coefficient to
control the policy entropy (similar to controlling the KL
penalty).

* Update rllib/agents/ppo/ddppo.py

Co-authored-by: avnishn <avnishnarayan@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-09-30 10:56:22 +02:00
Chris K. W
291fd36dee
_ray_trace_ctx fix follow-up (#18950)
* sanity check

* add test case

* fix assert

* refactor

* check kwargs instead of _kwargs

* format
2021-09-29 23:53:04 -07:00
Sven Mika
05a55a9335
[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942) 2021-09-30 08:30:20 +02:00
SangBin Cho
55227a15b9
Handle retry to avoid statement timeout exception. (#18968) 2021-09-29 23:04:35 -07:00
Clark Zinzow
74b5d3d8f7
[Datasets] Minimize truncation on balanced splits. (#18953)
* Minimize truncation on balanced splits.

* Refactor into subroutines.

* Feedback and fixes.
2021-09-29 21:57:08 -07:00
Alex Wu
5709c6501b
[dataset][usability] Dataset dependencies (#18346) 2021-09-29 17:29:31 -07:00
Yi Cheng
a993f3a262
[nightly] update nightly test for many node test 2021-09-29 17:28:44 -07:00
Clark Zinzow
73a6cda812
Handle empty datasets properly in most Dataset transformations. (#18983) 2021-09-29 17:27:03 -07:00
Eric Liang
aa985e1a9c
Fix false positive error message from autoscaler events (#18981) 2021-09-29 15:51:18 -07:00
Jiajun Yao
be29d27e8a
[Scalability Envelope] Include broadcast time in test_object_store result json (#18974) 2021-09-29 13:49:16 -07:00
Stephanie Wang
5eddaabd11
[core] Fix bug in dependency resolution for actor handles (#18862)
* x

* lint
2021-09-29 13:25:31 -07:00
Antoni Baum
573c66a755
[GCP] Update GCP TPU config (#18634)
* [autoscaler] Update GCP TPU config

* Preemptible by default

* Remove libtpu link from head node

* Workaround
2021-09-29 12:41:26 -07:00
Sven Mika
9c9b482661
[RLlib] Allow n-step > 1 and prio. replay for R2D2 and RNNSAC. (#18939) 2021-09-29 21:31:34 +02:00
Sven Mika
b99943806e
[RLlib] Add support for IMPALA to handle more than one loss/optimizer (analogous to recent enhancement for APPO). (#18971) 2021-09-29 21:30:04 +02:00
Jiajun Yao
ed9118393c
Listen to 127.0.0.1 by default on mac osx (#18904) 2021-09-29 11:40:19 -07:00
Eric Liang
3665c99896
Deflake test_failure_2.py::test_warning_for_infeasible_zero_cpu_actor 2021-09-29 11:39:16 -07:00
Dmitri Gekhtman
944309c017
Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)" (#18954)
* Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)"

This reverts commit fc6a739e4b.

* move to large test

Co-authored-by: Yi Cheng <chengyidna@gmail.com>
2021-09-29 11:02:14 -04:00
Jiajun Yao
35774fd399
[CI] Print out the mismatched commit in ci (#18956) 2021-09-29 15:48:57 +01:00
Chong-Li
42744f29ee
[GCS] Make Gcs-based actor scheduler's bookkeeping consistent (#18546)
* Make Gcs-based scheduler's bookkeeping consistent

* Remove this from lambda function

* Fix lambda function

* Trigger SchedulePendingActors

* Test for acquiring/releasing resources

* Reorganize structure

* Avoid overloading post

* Fix gcs_actor_manager_test

* Fix post counter and rename some func

* Fix unique_ptr

* Fix unique_ptr

* Fix book lint error

* Lint

Co-authored-by: Chong-Li <lc300133@antgroup.com>
2021-09-29 05:53:34 -07:00
Lixin Wei
a6a02779fe
[Core] remove verbose log from task execution (#18736) 2021-09-29 00:31:33 -07:00
matthewdeng
91a5f67261
[SGD] add share_cuda_visible_devices config flag (#18958) 2021-09-29 00:21:46 -07:00
Chu Xiangyang
505aa89d12
[Dashboard] Add start/end time for job (#18901) 2021-09-28 20:57:13 -07:00
Yi Cheng
16cf719aff
[core] hot fix of build failure (#18963) 2021-09-28 20:29:28 -07:00
Yi Cheng
96dff6e46d
[core] fix implicit merge conflict (#18961) 2021-09-28 19:18:54 -07:00
matthewdeng
83a28cc901
[client] update documentation with AWS security_group (#18905)
* [client] update documentation with AWS security_group

* Update doc/source/cluster/ray-client.rst

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>

* lint

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
2021-09-28 16:47:46 -07:00
Eric Liang
4d763d3ffd
Increase metrics fetch timeout in autoscaler for large clusters 2021-09-28 15:24:44 -07:00
Edward Oakes
73b8936aa8
[runtime_env] Unify rpc::RuntimeEnv with serialized_runtime_env field (#18641) 2021-09-28 15:13:15 -05:00
Yi Cheng
4af07a8917
[rpc] cpu improvement of protobuf in gcs (#17933) 2021-09-28 11:47:19 -07:00
SangBin Cho
a0a02f4982
[Placement Group] Fix placement group high cpu usage part 1 (#18652) 2021-09-28 11:14:59 -07:00
Edward Oakes
96054953cc
[serve] Remove python_methods logic and raise an error dynamically instead (#18927) 2021-09-28 09:51:46 -07:00
Chris K. W
191af472ac
[client] remove ray_trace_ctx from kwargs if tracing disabled (#18926) 2021-09-28 09:47:43 -07:00
Ian Rodney
0d3544588e
[AutoRun] Fix Auto-Run for Client (#18457)
* remove old version

* auto init first attempt

* arg for fn decorator

* default to True

* ray.method should not autostart

* comments

* no auto init on global state fns

* tiny test fix

* quick tests

* respond to comments

* explain func

* fix comments

* forgot to save

* fix again

* fix reconnect tests

* fix medium tests

* fix workflows test

* Better fix for workflows
2021-09-28 08:00:26 -07:00
Yi Cheng
e3dd1e3751
Revert "Revert "[test] add unit test for PR #17634 (#18585)" (#18830)" (#18871)
* Revert "Revert "[test] add unit test for PR #17634 (#18585)" (#18830)"

This reverts commit 8dd3057644.

* up
2021-09-28 05:53:52 -07:00
Richard Liaw
227aa9e89b
[tune] change delimiter for results (#16573)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-09-28 10:03:00 +01:00
Kai Fricke
6be87a3453
[tune] fix tune list-trials metric (#18914) 2021-09-28 09:59:32 +01:00
Chen Shen
057c425122
[Core][CoreWorker] call shutdown in the correct thread (#18910) 2021-09-28 01:29:47 -07:00
Chen Shen
62a73f4ce8
[nightly test][event] enable event logs in nightly tests (#18936) 2021-09-28 01:29:26 -07:00
Yi Cheng
e4f4c79252
Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)
This reverts commit d4d71985d5.
2021-09-27 21:52:13 -07:00
matthewdeng
d2caa00be8
[SGD] add SGDv2 survey link to docs (#18934) 2021-09-27 19:15:37 -07:00
Chen Shen
25d14cb4de
Ensure task_execution_service_ is destructed first (#18913) 2021-09-27 18:31:56 -07:00
Eric Liang
caf34a452c
Unify ArrowTensorType tables and Tensor blocks (#18867) 2021-09-27 16:24:09 -07:00
Maxim Egorushkin
be0133da1d
[Autoscaler][GCP] Allow Google Compute Engine instance templates. (#18620)
Co-authored-by: Maxim Egorushkin <maxim.egorushkin@gmail.com>
Co-authored-by: Ian <ian.rodney@gmail.com>
2021-09-27 16:08:41 -07:00
architkulkarni
d4d71985d5
[Serve] [doc] Improve runtime env doc (#18782) 2021-09-27 16:12:03 -05:00
Yi Cheng
994d156a0d
[doc] Mark multi-client as an experimental feature 2021-09-27 14:08:49 -07:00
Edward Oakes
627e475cc1
[docs] Don't recommend SignalActor from test_utils 2021-09-27 12:08:37 -07:00
Chen Shen
cbd7dc749c
[Core][CoreWorker] fix data race of exiting_ 2021-09-27 10:55:03 -07:00
Antoni Baum
72cc0c9bda
[SGDv2] Add Tune-Cifar-PyTorch-PBT example (#18860)
* [SGDv2] Add Tune-Cifar-PyTorch-PBT example

* Update python/ray/util/sgd/v2/BUILD

* Lint

* Update example

* Update docs
2021-09-27 09:22:40 -07:00
Jernej Makovsek
d6758ff92a
[tune] Fix HEBOSearch installation docs (#18861)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2021-09-27 09:06:14 +01:00
Qing Wang
90d2456ec7
[Java] Support userloggers. (#18846)
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-09-26 16:53:06 +08:00