Commit graph

2150 commits

Author SHA1 Message Date
Sven Mika
510c850651
[RLlib] SAC add discrete action support. (#7320)
* Exploration API (+EpsilonGreedy sub-class).

* Exploration API (+EpsilonGreedy sub-class).

* Cleanup/LINT.

* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).

* Add `error` option to deprecation_warning().

* WIP.

* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.

* WIP.

* LINT.

* WIP.

* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.

* Fix bug in sampler.py in case Policy has self.exploration = None

* Update rllib/agents/dqn/dqn.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* Update rllib/agents/trainer.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* Change requests.

* LINT

* In tune/utils/util.py::deep_update(), only keep deep-updating if both original and value are dicts. If value is not a dict, set

* Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).

* Update rllib/evaluation/worker_set.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Review fixes.

* Fix default value for DQN's exploration spec.

* LINT

* Fix recursion bug (wrong parent c'tor).

* Do not pass timestep to get_exploration_info.

* Update tf_policy.py

* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.

* Bug fix tf-action-dist

* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).

* Switch off exploration when getting action probs from off-policy-estimator's policy.

* LINT

* Fix test_checkpoint_restore.py.

* Deprecate all SAC exploration (unused) configs.

* Properly use `model.last_output()` everywhere instead of `model._last_output`.

* WIP.

* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).

* WIP.

* Trigger re-test (flaky checkpoint-restore test).

* WIP.

* WIP.

* Add test case for deterministic action sampling in PPO.

* bug fix.

* Added deterministic test cases for different Agents.

* Fix problem with TupleActions in dynamic-tf-policy.

* Separate supported_spaces tests so they can be run separately for easier debugging.

* LINT.

* Fix autoregressive_action_dist.py test case.

* Re-test.

* Fix.

* Remove duplicate py_test rule from bazel.

* LINT.

* WIP.

* WIP.

* SAC fix.

* SAC fix.

* WIP.

* WIP.

* WIP.

* FIX 2 examples tests.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* LINT.

* Renamed test file.

* WIP.

* Add unittest.main.

* Make action_dist_class mandatory.

* fix

* FIX.

* WIP.

* WIP.

* Fix.

* Fix.

* Fix explorations test case (contextlib cannot find its own nullcontext??).

* Force torch to be installed for QMIX.

* LINT.

* Fix determine_tests_to_run.py.

* Fix determine_tests_to_run.py.

* WIP

* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).

* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).

* Rename some stuff.

* Rename some stuff.

* WIP.

* update.

* WIP.

* Gumbel Softmax Dist.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP

* WIP.

* WIP.

* Hypertune.

* Hypertune.

* Hypertune.

* Lock-in.

* Cleanup.

* LINT.

* Fix.

* Update rllib/policy/eager_tf_policy.py

Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>

* Update rllib/agents/sac/sac_policy.py

Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>

* Update rllib/agents/sac/sac_policy.py

Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>

* Update rllib/models/tf/tf_action_dist.py

Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>

* Update rllib/models/tf/tf_action_dist.py

Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>

* Fix items from review comments.

* Add dm_tree to RLlib dependencies.

* Add dm_tree to RLlib dependencies.

* Fix DQN test cases ((Torch)Categorical).

* Fix wrong pip install.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-06 10:37:12 -08:00
Eric Liang
476b5c6196
[Parallel Iterators] Allow for operator chaining after repartition (#7268)
* bug fix repartition

* change add_transform from private to inner

* formatting

* addressing comments

* formatting
2020-03-04 14:42:52 -08:00
Philipp Moritz
de0c99876e
Fix fate_share not being passed to Redis shards (#7432) 2020-03-04 11:29:45 -08:00
Edward Oakes
0abcca258f
Add entries to in-memory store on Put() (#7085) 2020-03-04 10:17:27 -08:00
Philipp Moritz
fb1c1e2d27
Revert "Keep cloudpickle up-to-date with the upstream (#7406)" (#7437)
This reverts commit f6883bf725.
2020-03-03 18:36:15 -08:00
Maksim Smolin
3a134c7224
[RaySGD] Rename PyTorch API endpoints to start with Torch (#7425)
* Start renaming pytorch to torch

* Rename PyTorchTrainer to TorchTrainer

* Rename PyTorch runners to Torch runners

* Finish renaming API

* Rename to torch in tests

* Finish renaming docs + tests

* Run format + fix DeprecationWarning

* fix

* move tests up

* rename

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-03 16:44:42 -08:00
Siyuan (Ryans) Zhuang
f6883bf725
Keep cloudpickle up-to-date with the upstream (#7406) 2020-03-03 13:52:54 -08:00
Edward Oakes
b0bf5450c2
Fix flaky multiprocessing tests (#7413) 2020-03-03 15:07:59 -06:00
ijrsvt
fb76092d75
Re-route asyncio plasma code path through raylet instead of direct plasma connection (#7234) 2020-03-03 15:43:46 -05:00
Edward Oakes
04ec599441
Use ray.kill() in multiprocessing.Pool (#7409) 2020-03-03 12:49:13 -06:00
Allen
b74eb5fce6
Capture output for commands run by the autoscaler (#7381) 2020-03-03 10:19:21 -08:00
mehrdadn
4d42664b2a
Use prctl(PR_SET_PDEATHSIG) on Linux instead of reaper (#7150) 2020-03-03 11:45:42 -06:00
ijrsvt
584645cc7d
Fix Experimental Async API (#7391) 2020-03-02 22:24:20 -06:00
Edward Oakes
580b017b43
Fix flaky global GC tests (#7407) 2020-03-02 21:03:01 -06:00
Edward Oakes
9e9f1962c7
Enable test_actor_pool in CI (#7405) 2020-03-02 20:24:36 -06:00
Edward Oakes
2b6f00724a
Enable test_joblib in CI (#7404) 2020-03-02 20:03:27 -06:00
Edward Oakes
d69fe54f6d
Temporarily skip testEndToEndReporting (#7402) 2020-03-02 18:27:34 -06:00
Siyuan (Ryans) Zhuang
0792b5cb93
Fix the numpy ndarray subclass serialization bug (#7392) 2020-03-01 23:05:59 -08:00
Richard Liaw
48cdca843f
[raysgd] Custom training operator (#7211) 2020-03-01 21:22:48 -08:00
Eric Liang
3c6b94f3f5
[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C (#7299) 2020-02-28 16:44:17 -08:00
Richard Liaw
fb73d51d4d
[tune] fix hparams for tbx (#7312)
* fix

* test_hist

* remove unnecessary value check

* pbt

* queue

* skip_for_now

* Apply suggestions from code review
2020-02-28 11:51:56 -08:00
Richard Liaw
ca40b0fcc6
[tune][minor] Avoid throwing error when gpu check fails (#7362) 2020-02-28 11:32:44 -08:00
Edward Oakes
f321eaec9b
Working but not passing test (#7358) 2020-02-28 12:57:28 -06:00
mehrdadn
fb0bc7b947
Partially revert "[Core/RLlib] Move log_once from rllib to ray.util. (#7273)" (#7361)
This partially reverts commit 357232d124.

The addition of python/__init__.py broke the build on Windows. However, this is difficult to notice because Bazel doesn't seem to notice this dependency. You first have to go to a commit that fails on this issue, and then try to re-build this commit, so that Bazel actually performs a rebuild.

A useful command line for triggering the exact build is:

bazel build --compile_one_dependency //:python/ray/_raylet.pyx
2020-02-28 10:27:45 -08:00
Edward Oakes
93fe4b0b58
Change actor.__ray_kill__() to ray.kill(actor) (#7360) 2020-02-28 11:55:13 -06:00
Richard Liaw
3fc162f93c
[tune] Add Unit Test for nested PBT + Jenkins (#7324) 2020-02-27 18:17:11 -08:00
mehrdadn
8730996682
Windows changes (#7315) 2020-02-27 15:14:10 -08:00
Edward Oakes
ced062319d
Decrease test_object_manager put size to avoid OOMs in CI (#7355) 2020-02-27 11:08:10 -08:00
Edward Oakes
cbf55d69a6
Remove serialized from_random object ids in tests (#7340) 2020-02-27 11:04:06 -08:00
Edward Oakes
bd9411f849
Call TriggerGlobalGC when the plasma store is full (#7337) 2020-02-27 11:01:49 -08:00
Sven Mika
357232d124
[Core/RLlib] Move log_once from rllib to ray.util. (#7273)
* Move log_once from rllib to tune.

* Move log_once from rllib to tune.

* LINT.

* Move to ray.util.debug.
2020-02-27 10:40:44 -08:00
Edward Oakes
d9027acaf2
Deprecate non-direct-call API (#7336) 2020-02-27 10:37:23 -08:00
Edward Oakes
55ccfb6089
Fix asyncio actor race condition (#7335) 2020-02-27 10:16:04 -08:00
Edward Oakes
ee0f71e398
Add __commit__ field to ray package in wheels (#7305) 2020-02-26 17:54:22 -08:00
Edward Oakes
2ad9bc5684
Move plasma retry logic into plasma store provider (#7328) 2020-02-26 16:57:02 -08:00
Eric Liang
b310661338
Add internal_api.global_gc() method, which triggers gc.collect() on all workers (#7327) 2020-02-26 14:09:29 -08:00
Stephanie Wang
9964657815
Fix plasma bug (#7322) 2020-02-25 18:15:28 -08:00
Edward Oakes
44b4394afa
Remove unused AddContainedObjectIDs (#7323) 2020-02-25 16:42:20 -08:00
Richard Liaw
226fcd5aff
Add Dashboard and Util to setup-dev (#7321) 2020-02-25 15:25:09 -08:00
Eric Liang
1ea05a2c08
[tune] Fix a number of reporter regressions and add end-to-end tests (#7274) 2020-02-25 14:31:56 -08:00
Eric Liang
f14b6e477b
Raise gRPC message size limit to 100MB (#7269) 2020-02-24 23:22:49 -08:00
Edward Oakes
f2faf8d26e
Fix passing duplicate by-reference arguments (#7306) 2020-02-24 19:18:16 -08:00
chaokunyang
8b6784de06
[Streaming] Streaming Python API (#6755) 2020-02-25 10:33:33 +08:00
Mitchell Stern
669bb403c3
Add TypeScript and HTML linting to Travis lint job (#7294) 2020-02-24 11:12:07 -08:00
Eric Liang
0ae4fe020d
revert omp threads fix (#7288) 2020-02-23 21:26:49 -08:00
fangfengbin
e7d0ec9531
Enable GCS server when running python unit tests (#7101)
* Enable GCS server when running python unit tests

* restart ci

* restart ci

* fix code style

* restart ci

* restart ci

* restart ci

* restart ci

* restart ci

* Define RAY_GCS_SERVICE_ENABLED as a constant

* fix review comments

* fix code style

* fix code style

* fix code style

* fix code style

* fix review comments

* add gcs service python testcase

* fix TESTSUITE name bug
2020-02-24 09:48:40 +08:00
Sven Mika
0db2046b0a
[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107) (#7124)
* Exploration API (+EpsilonGreedy sub-class).

* Exploration API (+EpsilonGreedy sub-class).

* Cleanup/LINT.

* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).

* Add `error` option to deprecation_warning().

* WIP.

* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.

* WIP.

* LINT.

* WIP.

* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.

* Fix bug in sampler.py in case Policy has self.exploration = None

* Update rllib/agents/dqn/dqn.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* Update rllib/agents/trainer.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* Change requests.

* LINT

* In tune/utils/util.py::deep_update(), only keep deep-updating if both original and value are dicts. If value is not a dict, set

* Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).

* Update rllib/evaluation/worker_set.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Review fixes.

* Fix default value for DQN's exploration spec.

* LINT

* Fix recursion bug (wrong parent c'tor).

* Do not pass timestep to get_exploration_info.

* Update tf_policy.py

* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.

* Bug fix tf-action-dist

* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).

* Switch off exploration when getting action probs from off-policy-estimator's policy.

* LINT

* Fix test_checkpoint_restore.py.

* Deprecate all SAC exploration (unused) configs.

* Properly use `model.last_output()` everywhere instead of `model._last_output`.

* WIP.

* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).

* WIP.

* Trigger re-test (flaky checkpoint-restore test).

* WIP.

* WIP.

* Add test case for deterministic action sampling in PPO.

* bug fix.

* Added deterministic test cases for different Agents.

* Fix problem with TupleActions in dynamic-tf-policy.

* Separate supported_spaces tests so they can be run separately for easier debugging.

* LINT.

* Fix autoregressive_action_dist.py test case.

* Re-test.

* Fix.

* Remove duplicate py_test rule from bazel.

* LINT.

* WIP.

* WIP.

* SAC fix.

* SAC fix.

* WIP.

* WIP.

* WIP.

* FIX 2 examples tests.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* LINT.

* Renamed test file.

* WIP.

* Add unittest.main.

* Make action_dist_class mandatory.

* fix

* FIX.

* WIP.

* WIP.

* Fix.

* Fix.

* Fix explorations test case (contextlib cannot find its own nullcontext??).

* Force torch to be installed for QMIX.

* LINT.

* Fix determine_tests_to_run.py.

* Fix determine_tests_to_run.py.

* WIP

* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).

* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).

* Rename some stuff.

* Rename some stuff.

* WIP.

* WIP.

* Fix SAC.

* Fix SAC.

* Fix strange tf-error in ray core tests.

* Fix strange ray-core tf-error in test_memory_scheduling test case.

* Fix test_io.py.

* LINT.

* Update SAC yaml files' config.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-02-22 14:19:49 -08:00
Stephanie Wang
4c2de7be54
[core] Ref counting for returning object IDs created by a different process (#7221)
* Add regression tests

* Refactor, split RemoveSubmittedTaskReferences into submitted and finished paths

* Add nested return IDs to UpdateFinishedTaskRefs, rename WrapObjectIds

* Basic unit tests pass

* Fix unit test and add an out-of-order regression test

* Add stored_in_objects to ObjectReferenceCount, regression test now passes

* Add an Address to the ReferenceCounter so we can determine ownership

* Set the nested return IDs from the TaskManager

* Add another test

* Simplify

* Update src/ray/core_worker/reference_count_test.cc

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* comments

* Add python test

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-02-22 13:29:48 -08:00
Amog Kamsetty
1737a113be
[Parallel Iterators] Repartition functionality (#7163)
* repartition and tests

* blacklist lib/ files from import checks

* addressing comments and splitting up tests

* code readability

* adding explicit ref for parent iterator

* formatting
2020-02-21 13:20:18 -08:00
mehrdadn
c6f50ecc51
setpgrp fix (#7250) 2020-02-21 13:15:11 -08:00