9706 commits

Author SHA1 Message Date
Jiajun Yao
7588bfd315
[Lint] Add flake8-bugbear (#19053)
* Add flake8-bugbear

* Add flake8-bugbear
2021-10-03 23:24:11 -07:00
Jiajun Yao
2b44e9a3e1
Increase disk for long running tests (#19064) 2021-10-03 22:52:44 -07:00
Jiajun Yao
b8ef4f0a34
[CI] Add a retry helper to e2e.py (#19045) 2021-10-02 09:54:41 -07:00
Siyuan (Ryans) Zhuang
28d905dcb0
[Workflow] Move arguments into workflow step context (#19003)
* refactor

* improve documentation

* fix comments

* Use dataclass for workflow context

* update docs
2021-10-01 23:48:57 -07:00
Eric Liang
032a420ee6
Rename Dataset.pipeline to Dataset.window (#19050) 2021-10-01 19:55:29 -07:00
Kai Fricke
3dc176c42e
[ci/tune] Add SGD and Tune GPU pipeline step to CI (#18469)
* [ci/tune] Add Tune GPU pipeline step to CI

* cont.

* add sgd gpu tests

* format yaml, fix imports

* install horovod; fix line wrapping

* set GPU per worker to 0.5

* fix import

* move test to 4gpu machine

* fix lint

* lint

* set visible devices

* pull in tf gpu fix

* Fix Tune GPU pipeline step

* nit

* Disable GPU tests until we have some

* Re-add empty rllib tests

Co-authored-by: Matthew Deng <matthew.j.deng@gmail.com>
2021-10-01 18:34:05 -07:00
Simon Mo
9b2a368c8c
[Runtime Env] Implement basic runtime env plugin mechanism (#19044) 2021-10-01 17:22:54 -07:00
Edward Oakes
cac6f9d75c
skip test on windows (#19047) 2021-10-01 15:56:37 -07:00
Ian Rodney
a4ebe2697c
[Autoscaler] Improve assert_called (#19036)
* improvements

* fix invocations

* improve not_has_call
2021-10-01 14:08:31 -07:00
Clark Zinzow
d22f838795
[Datasets] Delineate between ref and raw APIs for the Pandas/Arrow integrations. (#18992) 2021-10-01 13:08:25 -07:00
Frank Luan
f885060efa
Disable distributed sort test on Windows (#19041)
* [WIP] Sorting benchmark

* Separate num_mappers and num_reducers

* Add tests

* Fix tests

* Tracing

* Separate num_mappers and num_reducers

* Two-stage reduce

* Back pressure to avoid excessive spilling

* Make merger_concurrency an option

* Fix tests

* Tweaks

* Remote writers

* Format

* WIP

* Address comments

* Fix tests and address comments

* Lint

* Fix mount points for testing

* Simplify code path

* Address comments

* Disable distributed sort test on Windows
2021-10-01 12:17:28 -07:00
mwtian
56debfc063
[Object manager] fix comments 2021-10-01 11:42:07 -07:00
Stephanie Wang
c052395f4e
[core] Remove "plasma promotion" for serialized ObjectRefs 2021-10-01 10:39:55 -07:00
architkulkarni
b0a5564f4e
[Serve] Integrate metrics with minimal autoscaling algorithm and add e2e test (#18793) 2021-10-01 10:21:12 -07:00
Antoni Baum
cc3199b814
[docs] Provide information about resource deadlocks, early stopping in Tune docs (#18947)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-10-01 13:52:47 +01:00
Dmitri Gekhtman
bfd706aea3
[test][k8s] Restore kubernetes test directory, adds some info (#18982) 2021-10-01 11:23:22 +01:00
Tom Birch
aa0cab5cae
Don't export absl symbols as they collide with tensorflow (#18870)
Co-authored-by: Tom Birch <tom@powerlinespro.com>
2021-10-01 13:20:59 +08:00
mwtian
49a57aa477
[Scheduling] Report resource demand for infeasible 1-CPU tasks (#19000) 2021-09-30 22:03:02 -07:00
Jiajun Yao
d64872dd67
Fix python mutable default argument anti-pattern (#19028) 2021-10-01 13:05:02 +09:00
mwtian
f6c1a12ffa
[Lint] update clang-tidy rules (#19025) 2021-09-30 20:12:30 -07:00
Edward Oakes
8e5d48d668
[runtime_env] Remove deprecated override_environment_variables and worker_env fields (#18213) 2021-09-30 18:55:24 -05:00
Jiajun Yao
81b052f222
[core] Fix port collision between metrics agent port and metrics export port (#19016) 2021-09-30 16:15:42 -07:00
Ian Rodney
02d1f659ba
[Workflows] Use RAY_ADDRESS in Tests (#19012) 2021-09-30 13:05:51 -07:00
Chris K. W
61d058fe66
[client] skip test_wrapped_actor_creation on windows (#19013)
* skip test_wrapped_actor_creation on windows

* rerun windows ci
2021-09-30 13:04:43 -07:00
Frank Luan
732af42ae9
[Sort benchmark] Two-stage reduce (#17055)
* [WIP] Sorting benchmark

* Separate num_mappers and num_reducers

* Add tests

* Fix tests

* Tracing

* Separate num_mappers and num_reducers

* Two-stage reduce

* Back pressure to avoid excessive spilling

* Make merger_concurrency an option

* Fix tests

* Tweaks

* Remote writers

* Format

* WIP

* Address comments

* Fix tests and address comments

* Lint

* Fix mount points for testing

* Simplify code path

* Address comments
2021-09-30 12:39:11 -07:00
Sven Mika
16ad46a654
[RLlib] Fix broken test_r2d2.py. (#19017) 2021-09-30 21:19:37 +02:00
Simon Mo
301312e77f
Fix windows build environment breakage (#19019) 2021-09-30 11:58:48 -07:00
architkulkarni
8af9646cb0
[Doc] [runtime env] Remove delta caching remark and state Client+@remote limitation (#19010) 2021-09-30 13:29:50 -05:00
architkulkarni
0f0b161ea1
Revert "Revert "[Serve] [doc] Improve runtime env doc"" (#18943)
* Revert "Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)"

This reverts commit e4f4c79252.
2021-09-30 13:28:44 -05:00
Clark Zinzow
e384a6c91f
(TaskPool) Cancel all transformation tasks when one task fails or when SIGINT is received. (#18991) 2021-09-30 10:56:30 -07:00
gjoliver
e61f2c72d7
Upgrade bazel version to 4.2.1 (#18996) 2021-09-30 10:50:54 -07:00
mwtian
d12e35ce53
[Object manager] don't abort entire pull request on race condition in concurrent chunk receive (#18955) 2021-09-30 10:19:54 -07:00
Simon Mo
910553c3bb
[Core] Add private method to retrieve current task queue length (#18964) 2021-09-30 09:20:04 -07:00
Amog Kamsetty
98ac3f601c
[SGD] v1 to v2 Migration Guide (#18887)
* wip

* add guide

* fix test

* address comments

* add to docs

* fix

* remove markdown

* add warning to all pages

* formatting

* fix

* links

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* address comments

* address comments

* fix

* address comments

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-09-30 09:15:21 -07:00
architkulkarni
bf6e50813c
[runtime env] Parse local pip/conda requirements files locally upon task/actor definition (#18988) 2021-09-30 09:47:15 -05:00
Sven Mika
ac3371a148
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917) 2021-09-30 16:39:38 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) 2021-09-30 16:39:05 +02:00
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. (#18921) 2021-09-30 15:03:37 +02:00
Avnish Narayan
6dc1a6b72f
[RLlib] Raise error for KL penalty in DDPPO (#18959)
* [RLlib] Raise error for KL penalty in DDPPO

DDPPO doesn't support KL penalties like PPO-1.
In order to support KL penalties, DDPPO would need to
become centralized, which defeats the purpose of the
algorithm. Users can still tune the entropy coefficient to
control the policy entropy (similar to controlling the KL
penalty).

* Update rllib/agents/ppo/ddppo.py

Co-authored-by: avnishn <avnishnarayan@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-09-30 10:56:22 +02:00
Chris K. W
291fd36dee
_ray_trace_ctx fix follow-up (#18950)
* sanity check

* add test case

* fix assert

* refactor

* check kwargs instead of _kwargs

* format
2021-09-29 23:53:04 -07:00
Sven Mika
05a55a9335
[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942) 2021-09-30 08:30:20 +02:00
SangBin Cho
55227a15b9
Handle retry to avoid statement timeout exception (#18968) 2021-09-29 23:04:35 -07:00
Clark Zinzow
74b5d3d8f7
[Datasets] Minimize truncation on balanced splits. (#18953)
* Minimize truncation on balanced splits.

* Refactor into subroutines.

* Feedback and fixes.
2021-09-29 21:57:08 -07:00
Alex Wu
5709c6501b
[dataset][usability] Dataset dependencies (#18346) 2021-09-29 17:29:31 -07:00
Yi Cheng
a993f3a262
[nightly] update nightly test for many node test 2021-09-29 17:28:44 -07:00
Clark Zinzow
73a6cda812
Handle empty datasets properly in most Dataset transformations. (#18983) 2021-09-29 17:27:03 -07:00
Eric Liang
aa985e1a9c
Fix false positive error message from autoscaler events (#18981) 2021-09-29 15:51:18 -07:00
Jiajun Yao
be29d27e8a
[Scalability Envelope] Include broadcast time in test_object_store result json (#18974) 2021-09-29 13:49:16 -07:00
Stephanie Wang
5eddaabd11
[core] Fix bug in dependency resolution for actor handles (#18862)
* x

* lint
2021-09-29 13:25:31 -07:00
Antoni Baum
573c66a755
[GCP] Update GCP TPU config (#18634)
* [autoscaler] Update GCP TPU config

* Preemptible by default

* Remove libtpu link from head node

* Workaround
2021-09-29 12:41:26 -07:00