9765 commits

Author SHA1 Message Date
SangBin Cho
55227a15b9
Handle retry to avoid statement timeout exception/ (#18968) 2021-09-29 23:04:35 -07:00
Clark Zinzow
74b5d3d8f7
[Datasets] Minimize truncation on balanced splits. (#18953)
* Minimize truncation on balanced splits.

* Refactor into subroutines.

* Feedback and fixes.
2021-09-29 21:57:08 -07:00
Alex Wu
5709c6501b
[dataset][usability] Dataset dependencies (#18346) 2021-09-29 17:29:31 -07:00
Yi Cheng
a993f3a262
[nightly] update nightly test for many node test 2021-09-29 17:28:44 -07:00
Clark Zinzow
73a6cda812
Handle empty datasets properly in most Dataset transformations. (#18983) 2021-09-29 17:27:03 -07:00
Eric Liang
aa985e1a9c
Fix false positive error message from autoscaler events (#18981) 2021-09-29 15:51:18 -07:00
Jiajun Yao
be29d27e8a
[Scalability Envelope] Include broadcast time in test_object_store result json (#18974) 2021-09-29 13:49:16 -07:00
Stephanie Wang
5eddaabd11
[core] Fix bug in dependency resolution for actor handles (#18862)
* x

* lint
2021-09-29 13:25:31 -07:00
Antoni Baum
573c66a755
[GCP] Update GCP TPU config (#18634)
* [autoscaler] Update GCP TPU config

* Preemptible by default

* Remove libtpu link from head node

* Workaround
2021-09-29 12:41:26 -07:00
Sven Mika
9c9b482661
[RLlib] Allow n-step > 1 and prio. replay for R2D2 and RNNSAC. (#18939) 2021-09-29 21:31:34 +02:00
Sven Mika
b99943806e
[RLlib] Add support for IMPALA to handle more than one loss/optimizer (analogous to recent enhancement for APPO). (#18971) 2021-09-29 21:30:04 +02:00
Jiajun Yao
ed9118393c
Listen to 127.0.0.1 by default on mac osx (#18904) 2021-09-29 11:40:19 -07:00
Eric Liang
3665c99896
Deflake test_failure_2.py::test_warning_for_infeasible_zero_cpu_actor 2021-09-29 11:39:16 -07:00
Dmitri Gekhtman
944309c017
Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)" (#18954)
* Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)"

This reverts commit fc6a739e4b.

* move to large test

Co-authored-by: Yi Cheng <chengyidna@gmail.com>
2021-09-29 11:02:14 -04:00
Jiajun Yao
35774fd399
[CI] Print out the mismatched commit in ci (#18956) 2021-09-29 15:48:57 +01:00
Chong-Li
42744f29ee
[GCS] Make Gcs-based actor scheduler's bookkeeping consistent (#18546)
* Make Gcs-based scheduler's bookkeeping consistent

* Remove this from lambda function

* Fix lambda function

* Trigger SchedulePendingActors

* Test for acquiring/releasing resources

* Reorganize structure

* Avoid overloading post

* Fix gcs_actor_manager_test

* Fix post counter and rename some func

* Fix unique_ptr

* Fix unique_ptr

* Fix book lint error

* Lint

Co-authored-by: Chong-Li <lc300133@antgroup.com>
2021-09-29 05:53:34 -07:00
Lixin Wei
a6a02779fe
[Core] remove verbose log from task execution (#18736) 2021-09-29 00:31:33 -07:00
matthewdeng
91a5f67261
[SGD] add share_cuda_visible_devices config flag (#18958) 2021-09-29 00:21:46 -07:00
Chu Xiangyang
505aa89d12
[Dashboard] Add start/end time for job (#18901) 2021-09-28 20:57:13 -07:00
Yi Cheng
16cf719aff
[core] hot fix of build failure (#18963) 2021-09-28 20:29:28 -07:00
Yi Cheng
96dff6e46d
[core] fix implicit merge conflict (#18961) 2021-09-28 19:18:54 -07:00
matthewdeng
83a28cc901
[client] update documentation with AWS security_group (#18905)
* [client] update documentation with AWS security_group

* Update doc/source/cluster/ray-client.rst

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>

* lint

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
2021-09-28 16:47:46 -07:00
Eric Liang
4d763d3ffd
Increase metrics fetch timeout in autoscaler for large clusters 2021-09-28 15:24:44 -07:00
Edward Oakes
73b8936aa8
[runtime_env] Unify rpc::RuntimeEnv with serialized_runtime_env field (#18641) 2021-09-28 15:13:15 -05:00
Yi Cheng
4af07a8917
[rpc] cpu improvement of protobuf in gcs (#17933) 2021-09-28 11:47:19 -07:00
SangBin Cho
a0a02f4982
[Placement Group] Fix placement group high cpu usage part 1 (#18652) 2021-09-28 11:14:59 -07:00
Edward Oakes
96054953cc
[serve] Remove python_methods logic and raise an error dynamically instead (#18927) 2021-09-28 09:51:46 -07:00
Chris K. W
191af472ac
[client] remove ray_trace_ctx from kwargs if tracing disabled (#18926) 2021-09-28 09:47:43 -07:00
Ian Rodney
0d3544588e
[AutoRun] Fix Auto-Run for Client (#18457)
* remove old version

* auto init first attempt

* arg for fn decorator

* default to True

* ray.method should not autostart

* comments

* no auto init on global state fns

* tiny test fix

* quick tests

* respond to comments

* explain func

* fix comments

* forgot to save

* fix again

* fix reconnect tests

* fix medium tests

* fix workflows test

* Better fix for workflows
2021-09-28 08:00:26 -07:00
Yi Cheng
e3dd1e3751
Revert "Revert "[test] add unit test for PR #17634 (#18585)" (#18830)" (#18871)
* Revert "Revert "[test] add unit test for PR #17634 (#18585)" (#18830)"

This reverts commit 8dd3057644.

* up
2021-09-28 05:53:52 -07:00
Richard Liaw
227aa9e89b
[tune] change delimiter for results (#16573)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-09-28 10:03:00 +01:00
Kai Fricke
6be87a3453
[tune] fix tune list-trials metric (#18914) 2021-09-28 09:59:32 +01:00
Chen Shen
057c425122
[Core][CoreWorker] call shutdown in the correct thread (#18910) 2021-09-28 01:29:47 -07:00
Chen Shen
62a73f4ce8
[nightly test][event] enable event logs in nightly tests (#18936) 2021-09-28 01:29:26 -07:00
Yi Cheng
e4f4c79252
Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)
This reverts commit d4d71985d5.
2021-09-27 21:52:13 -07:00
matthewdeng
d2caa00be8
[SGD] add SGDv2 survey link to docs (#18934) 2021-09-27 19:15:37 -07:00
Chen Shen
25d14cb4de
Ensure task_execution_service_ is destructed first (#18913) 2021-09-27 18:31:56 -07:00
Eric Liang
caf34a452c
Unify ArrowTensorType tables and Tensor blocks (#18867) 2021-09-27 16:24:09 -07:00
Maxim Egorushkin
be0133da1d
[Autoscaler][GCP] Allow Google Compute Engine instance templates. (#18620)
Co-authored-by: Maxim Egorushkin <maxim.egorushkin@gmail.com>
Co-authored-by: Ian <ian.rodney@gmail.com>
2021-09-27 16:08:41 -07:00
architkulkarni
d4d71985d5
[Serve] [doc] Improve runtime env doc (#18782) 2021-09-27 16:12:03 -05:00
Yi Cheng
994d156a0d
[doc] Mark multi-client as an experimental feature 2021-09-27 14:08:49 -07:00
Edward Oakes
627e475cc1
[docs] Don't recommend SignalActor from test_utils 2021-09-27 12:08:37 -07:00
Chen Shen
cbd7dc749c
[Core][CoreWorker] fix data race of exiting_ 2021-09-27 10:55:03 -07:00
Antoni Baum
72cc0c9bda
[SGDv2] Add Tune-Cifar-PyTorch-PBT example (#18860)
* [SGDv2] Add Tune-Cifar-PyTorch-PBT example

* Update python/ray/util/sgd/v2/BUILD

* Lint

* Update example

* Update docs
2021-09-27 09:22:40 -07:00
Jernej Makovsek
d6758ff92a
[tune] Fix HEBOSearch installation docs (#18861)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2021-09-27 09:06:14 +01:00
Qing Wang
90d2456ec7
[Java] Support userloggers. (#18846)
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-09-26 16:53:06 +08:00
Chen Shen
aaae8c122b
Fix pid None in logs (#18898) 2021-09-25 17:07:32 -07:00
Eric Liang
1ffec66cd5
[CI] Disable test_reference_counting on Windows build 2021-09-25 13:40:00 -07:00
mwtian
66aac2e219
[C++] Use RayConfig to read internal environment variables only once (#18869)
* store environ on first access

* fix

* Use RayConfig

* fix

* fix

* Revert removal of headers. They are actually used.

* rename

* fix lint

* format

* use std::getenv()

* fix
2021-09-25 12:27:42 -07:00
Yi Cheng
65fa740c3b
Fix minor doc error (#18896) 2021-09-24 22:21:10 -07:00