Commit graph

9694 commits

Author SHA1 Message Date
Stephanie Wang
c052395f4e
[core] Remove "plasma promotion" for serialized ObjectRefs 2021-10-01 10:39:55 -07:00
architkulkarni
b0a5564f4e
[Serve] Integrate metrics with minimal autoscaling algorithm and add e2e test (#18793) 2021-10-01 10:21:12 -07:00
Antoni Baum
cc3199b814
[docs] Provide information about resource deadlocks, early stopping in Tune docs (#18947)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-10-01 13:52:47 +01:00
Dmitri Gekhtman
bfd706aea3
[test][k8s] Restore kubernetes test directory, adds some info (#18982) 2021-10-01 11:23:22 +01:00
Tom Birch
aa0cab5cae
Don't export absl symbols as they collide with tensorflow (#18870)
Co-authored-by: Tom Birch <tom@powerlinespro.com>
2021-10-01 13:20:59 +08:00
mwtian
49a57aa477
[Scheduling] Report resource demand for infeasible 1-CPU tasks (#19000) 2021-09-30 22:03:02 -07:00
Jiajun Yao
d64872dd67
Fix python mutable default argument anti-pattern (#19028) 2021-10-01 13:05:02 +09:00
mwtian
f6c1a12ffa
[Lint] update clang-tidy rules (#19025) 2021-09-30 20:12:30 -07:00
Edward Oakes
8e5d48d668
[runtime_env] Remove deprecated override_environment_variables and worker_env fields (#18213) 2021-09-30 18:55:24 -05:00
Jiajun Yao
81b052f222
[core] Fix port collision between metrics agent port and metrics export port (#19016) 2021-09-30 16:15:42 -07:00
Ian Rodney
02d1f659ba
[Workflows] Use RAY_ADDRESS in Tests (#19012) 2021-09-30 13:05:51 -07:00
Chris K. W
61d058fe66
[client] skip test_wrapped_actor_creation on windows (#19013)
* skip test_wrapped_actor_creation on windows

* rerun windows ci
2021-09-30 13:04:43 -07:00
Frank Luan
732af42ae9
[Sort benchmark] Two-stage reduce (#17055)
* [WIP] Sorting benchmark

* Separate num_mappers and num_reducers

* Add tests

* Fix tests

* Tracing

* Separate num_mappers and num_reducers

* Two-stage reduce

* Back pressure to avoid excessive spilling

* Make merger_concurrency an option

* Fix tests

* Tweaks

* Remote writers

* Format

* WIP

* Address comments

* Fix tests and address comments

* Lint

* Fix mount points for testing

* Simplify code path

* Address comments
2021-09-30 12:39:11 -07:00
Sven Mika
16ad46a654
[RLlib] Fix broken test_r2d2.py. (#19017) 2021-09-30 21:19:37 +02:00
Simon Mo
301312e77f
Fix windows build environment breakage (#19019) 2021-09-30 11:58:48 -07:00
architkulkarni
8af9646cb0
[Doc] [runtime env] Remove delta caching remark and state Client+@remote limitation (#19010) 2021-09-30 13:29:50 -05:00
architkulkarni
0f0b161ea1
Revert "Revert "[Serve] [doc] Improve runtime env doc"" (#18943)
* Revert "Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)"

This reverts commit e4f4c79252.
2021-09-30 13:28:44 -05:00
Clark Zinzow
e384a6c91f
(TaskPool) Cancel all transformation tasks when one task fails or when SIGINT is received. (#18991) 2021-09-30 10:56:30 -07:00
gjoliver
e61f2c72d7
Upgrade bazel version to 4.2.1 (#18996) 2021-09-30 10:50:54 -07:00
mwtian
d12e35ce53
[Object manager] don't abort entire pull request on race condition in concurrent chunk receive (#18955) 2021-09-30 10:19:54 -07:00
Simon Mo
910553c3bb
[Core] Add private method to retrieve current task queue length (#18964) 2021-09-30 09:20:04 -07:00
Amog Kamsetty
98ac3f601c
[SGD] v1 to v2 Migration Guide (#18887)
* wip

* add guide

* fix test

* address comments

* add to docs

* fix

* remove markdown

* add warning to all pages

* formatting

* fix

* links

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* address comments

* address comments

* fix

* address comments

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-09-30 09:15:21 -07:00
architkulkarni
bf6e50813c
[runtime env] Parse local pip/conda requirements files locally upon task/actor definition (#18988) 2021-09-30 09:47:15 -05:00
Sven Mika
ac3371a148
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917) 2021-09-30 16:39:38 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) 2021-09-30 16:39:05 +02:00
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. (#18921) 2021-09-30 15:03:37 +02:00
Avnish Narayan
6dc1a6b72f
[RLlib] Raise error for kl penalty ddpo (#18959)
* [RLlib] Raise error for kl penalty ddpo

DDPPO doesn't support KL penalties the way PPO-1 does.
Supporting KL penalties would require DDPPO to become
centralized, which defeats the purpose of the
algorithm. Users can still tune the entropy coefficient to
control the policy entropy (similar to controlling the KL
penalty).

* Update rllib/agents/ppo/ddppo.py

Co-authored-by: avnishn <avnishnarayan@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-09-30 10:56:22 +02:00
Chris K. W
291fd36dee
_ray_trace_ctx fix follow-up (#18950)
* sanity check

* add test case

* fix assert

* refactor

* check kwargs instead of _kwargs

* format
2021-09-29 23:53:04 -07:00
Sven Mika
05a55a9335
[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942) 2021-09-30 08:30:20 +02:00
SangBin Cho
55227a15b9
Handle retry to avoid statement timeout exception (#18968) 2021-09-29 23:04:35 -07:00
Clark Zinzow
74b5d3d8f7
[Datasets] Minimize truncation on balanced splits. (#18953)
* Minimize truncation on balanced splits.

* Refactor into subroutines.

* Feedback and fixes.
2021-09-29 21:57:08 -07:00
Alex Wu
5709c6501b
[dataset][usability] Dataset dependencies (#18346) 2021-09-29 17:29:31 -07:00
Yi Cheng
a993f3a262
[nightly] update nightly test for many node test 2021-09-29 17:28:44 -07:00
Clark Zinzow
73a6cda812
Handle empty datasets properly in most Dataset transformations. (#18983) 2021-09-29 17:27:03 -07:00
Eric Liang
aa985e1a9c
Fix false positive error message from autoscaler events (#18981) 2021-09-29 15:51:18 -07:00
Jiajun Yao
be29d27e8a
[Scalability Envelope] Include broadcast time in test_object_store result json (#18974) 2021-09-29 13:49:16 -07:00
Stephanie Wang
5eddaabd11
[core] Fix bug in dependency resolution for actor handles (#18862)
* x

* lint
2021-09-29 13:25:31 -07:00
Antoni Baum
573c66a755
[GCP] Update GCP TPU config (#18634)
* [autoscaler] Update GCP TPU config

* Preemptible by default

* Remove libtpu link from head node

* Workaround
2021-09-29 12:41:26 -07:00
Sven Mika
9c9b482661
[RLlib] Allow n-step > 1 and prio. replay for R2D2 and RNNSAC. (#18939) 2021-09-29 21:31:34 +02:00
Sven Mika
b99943806e
[RLlib] Add support for IMPALA to handle more than one loss/optimizer (analogous to recent enhancement for APPO). (#18971) 2021-09-29 21:30:04 +02:00
Jiajun Yao
ed9118393c
Listen to 127.0.0.1 by default on mac osx (#18904) 2021-09-29 11:40:19 -07:00
Eric Liang
3665c99896
Deflake test_failure_2.py::test_warning_for_infeasible_zero_cpu_actor 2021-09-29 11:39:16 -07:00
Dmitri Gekhtman
944309c017
Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)" (#18954)
* Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)"

This reverts commit fc6a739e4b.

* move to large test

Co-authored-by: Yi Cheng <chengyidna@gmail.com>
2021-09-29 11:02:14 -04:00
Jiajun Yao
35774fd399
[CI] Print out the mismatched commit in ci (#18956) 2021-09-29 15:48:57 +01:00
Chong-Li
42744f29ee
[GCS] Make Gcs-based actor scheduler's bookkeeping consistent (#18546)
* Make Gcs-based scheduler's bookkeeping consistent

* Remove this from lambda function

* Fix lambda function

* Trigger SchedulePendingActors

* Test for acquiring/releasing resources

* Reorganize structure

* Avoid overloading post

* Fix gcs_actor_manager_test

* Fix post counter and rename some func

* Fix unique_ptr

* Fix unique_ptr

* Fix book lint error

* Lint

Co-authored-by: Chong-Li <lc300133@antgroup.com>
2021-09-29 05:53:34 -07:00
Lixin Wei
a6a02779fe
[Core] remove verbose log from task execution (#18736) 2021-09-29 00:31:33 -07:00
matthewdeng
91a5f67261
[SGD] add share_cuda_visible_devices config flag (#18958) 2021-09-29 00:21:46 -07:00
Chu Xiangyang
505aa89d12
[Dashboard] Add start/end time for job (#18901) 2021-09-28 20:57:13 -07:00
Yi Cheng
16cf719aff
[core] hot fix of build failure (#18963) 2021-09-28 20:29:28 -07:00
Yi Cheng
96dff6e46d
[core] fix implicit merge conflict (#18961) 2021-09-28 19:18:54 -07:00