Sven Mika
22ccc43670
[RLlib] DQN torch version. ( #7597 )
...
* Fix.
* Rollback.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix.
* Fix.
* Fix.
* WIP.
* WIP.
* Fix.
* Test case fixes.
* Test case fixes and LINT.
* Test case fixes and LINT.
* Rollback.
* WIP.
* WIP.
* Test case fixes.
* Fix.
* Fix.
* Fix.
* Add regression test for DQN w/ param noise.
* Fixes and LINT.
* Fixes and LINT.
* Fixes and LINT.
* Fixes and LINT.
* Fixes and LINT.
* Comment
* Regression test case.
* WIP.
* WIP.
* LINT.
* LINT.
* WIP.
* Fix.
* Fix.
* Fix.
* LINT.
* Fix (SAC currently does not support eager).
* Fix.
* WIP.
* LINT.
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* WIP.
* Fix.
* LINT.
* LINT.
* Fix and LINT.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Fix.
* Fix and LINT.
* Update rllib/utils/exploration/exploration.py
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Fixes.
* WIP.
* LINT.
* Fixes and LINT.
* LINT and fixes.
* LINT.
* Move action_dist back into torch extra_action_out_fn and LINT.
* Working SimpleQ learning cartpole on both torch AND tf.
* Working Rainbow learning cartpole on tf.
* Working Rainbow learning cartpole on tf.
* WIP.
* LINT.
* LINT.
* Update docs and add torch to APEX test.
* LINT.
* Fix.
* LINT.
* Fix.
* Fix.
* Fix and docstrings.
* Fix broken RLlib tests in master.
* Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier).
* Fix error_outputs option in BAZEL for RLlib regression tests.
* Fix.
* Tune param-noise tests.
* LINT.
* Fix.
* Fix.
* test
* test
* test
* Fix.
* Fix.
* WIP.
* WIP.
* WIP.
* WIP.
* LINT.
* WIP.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-06 11:56:16 -07:00
Sven Mika
82c2d9faba
[RLlib] Fix broken RLlib tests in master. ( #7894 )
2020-04-05 09:34:23 -07:00
Sven Mika
1d4823c0ec
[RLlib] Add testing framework_iterator. ( #7852 )
...
* Add testing framework_iterator.
* LINT.
* WIP.
* Fix and LINT.
* LINT fix.
2020-04-03 12:24:25 -07:00
Sven Mika
5537fe13b0
[RLlib] Exploration API: ParamNoise Integration into DQN; working example/test cases. ( #7814 )
2020-04-03 10:44:25 -07:00
Sven Mika
e153e3179f
[RLlib] Exploration API: Policy changes needed for forward pass noisifications. ( #7798 )
...
* Rollback.
* WIP.
* WIP.
* LINT.
* WIP.
* Fix.
* Fix.
* Fix.
* LINT.
* Fix (SAC currently does not support eager).
* Fix.
* WIP.
* LINT.
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* WIP.
* Fix.
* LINT.
* LINT.
* Fix and LINT.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Fix.
* Fix and LINT.
* Update rllib/utils/exploration/exploration.py
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Fixes.
* LINT.
* WIP.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-01 00:43:21 -07:00
Sven Mika
e4bd5db4d8
[RLlib] Minimal ParamNoise PR. ( #7772 )
2020-03-28 16:16:30 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. ( #7482 )
2020-03-23 12:19:30 -07:00
mehrdadn
a0700e2f86
Change /tmp to platform-specific temporary directory ( #7529 )
2020-03-16 18:10:14 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
...
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest compatible
* fix test
* fix sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
c3a8ba399f
[rllib] Enable distributed exec api for A2C, A3C, PG by default ( #7580 )
2020-03-13 18:48:41 -07:00
Sven Mika
552cfb37ea
[RLlib] Fix bugs and speed up SegmentTree
2020-03-13 01:03:07 -07:00
Sven Mika
f165766813
[RLlib] Bug: If trainer config horizon is provided, should try to increase env steps to that value. ( #7531 )
2020-03-12 11:03:37 -07:00
Sven Mika
80d314ae5e
[RLlib] Add all agents to rllib rollout tests. ( #7534 )
2020-03-12 11:02:51 -07:00
Eric Liang
f5d12a958b
[rllib] Port Ape-X to distributed execution API ( #7497 )
2020-03-12 00:54:08 -07:00
Sven Mika
20ef4a8603
[RLlib] Cleanup/unify all test cases. ( #7533 )
2020-03-11 20:39:47 -07:00
Eric Liang
a644060daa
[rllib] First pass at pipeline implementation of DQN ( #7433 )
...
* wip iters
* add test
* speed up
* update docs
* document it
* support serial sampling
* add test
* spacing
* annotate it
* update
* rename to pipeline
* comment
* iter2 wip
* update
* update
* context test
* update
* fix
* fix
* a3c pipeline
* doc
* update
* move timer
* comment
* add pipeline test
* fix
* clean up
* document
* iter s
* wip dqn
* wip
* wip
* metrics
* metrics rename
* metrics ctx
* wip
* constants
* add todo
* support .union
* wip
* support union
* remove prints
* add todo
* remove auto timer
* fix up
* fix pipeline test
* typing
* fix breakage
* remove bad assert
* wip
* fix multiagent example
* fixapply
* update a3c
* remove a2c pl
* 0 workers
* wip
* wip
* share metrics
* wip
* wip
* doc
* fix weight sync and global var updates
* mode
* fix
* fix
* doc
* fix
2020-03-07 14:47:58 -08:00
Sven Mika
510c850651
[RLlib] SAC add discrete action support. ( #7320 )
...
* Exploration API (+EpsilonGreedy sub-class).
* Exploration API (+EpsilonGreedy sub-class).
* Cleanup/LINT.
* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).
* Add `error` option to deprecation_warning().
* WIP.
* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.
* WIP.
* LINT.
* WIP.
* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.
* Fix bug in sampler.py in case Policy has self.exploration = None
* Update rllib/agents/dqn/dqn.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Update rllib/agents/trainer.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Change requests.
* LINT
* In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set
* Completely obsolete sync_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).
* Update rllib/evaluation/worker_set.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Review fixes.
* Fix default value for DQN's exploration spec.
* LINT
* Fix recursion bug (wrong parent c'tor).
* Do not pass timestep to get_exploration_info.
* Update tf_policy.py
* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.
* Bug fix tf-action-dist
* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).
* Switch off exploration when getting action probs from off-policy-estimator's policy.
* LINT
* Fix test_checkpoint_restore.py.
* Deprecate all SAC exploration (unused) configs.
* Properly use `model.last_output()` everywhere instead of `model._last_output`.
* WIP.
* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).
* WIP.
* Trigger re-test (flaky checkpoint-restore test).
* WIP.
* WIP.
* Add test case for deterministic action sampling in PPO.
* bug fix.
* Added deterministic test cases for different Agents.
* Fix problem with TupleActions in dynamic-tf-policy.
* Separate supported_spaces tests so they can be run separately for easier debugging.
* LINT.
* Fix autoregressive_action_dist.py test case.
* Re-test.
* Fix.
* Remove duplicate py_test rule from bazel.
* LINT.
* WIP.
* WIP.
* SAC fix.
* SAC fix.
* WIP.
* WIP.
* WIP.
* FIX 2 examples tests.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Renamed test file.
* WIP.
* Add unittest.main.
* Make action_dist_class mandatory.
* fix
* FIX.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix explorations test case (contextlib cannot find its own nullcontext??).
* Force torch to be installed for QMIX.
* LINT.
* Fix determine_tests_to_run.py.
* Fix determine_tests_to_run.py.
* WIP
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Rename some stuff.
* Rename some stuff.
* WIP.
* update.
* WIP.
* Gumbel Softmax Dist.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP
* WIP.
* WIP.
* Hypertune.
* Hypertune.
* Hypertune.
* Lock-in.
* Cleanup.
* LINT.
* Fix.
* Update rllib/policy/eager_tf_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Fix items from review comments.
* Add dm_tree to RLlib dependencies.
* Add dm_tree to RLlib dependencies.
* Fix DQN test cases ((Torch)Categorical).
* Fix wrong pip install.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-06 10:37:12 -08:00
Eric Liang
0f88444686
[rllib] Support multi-agent training in pipeline impls, add easy flag to enable ( #7338 )
2020-03-02 15:16:37 -08:00
Sven Mika
d8eeb96413
Fix issue with torch PPO not handling action spaces of shape=(>1,). ( #7398 )
2020-03-02 10:53:19 -08:00
Sven Mika
83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. ( #7314 )
...
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix
* WIP.
* Add TD3 quick Pendulum regression.
* Cleanup.
* Fix.
* LINT.
* Fix.
* Sort quick_learning test cases, add TD3.
* Sort quick_learning test cases, add TD3.
* Revert test_checkpoint_restore.py (debugging) changes.
* Fix old soft_q settings in documentation and test configs.
* More doc fixes.
* Fix test case.
* Fix test case.
* Lower test load.
* WIP.
2020-03-01 11:53:35 -08:00
Eric Liang
3c6b94f3f5
[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C ( #7299 )
2020-02-28 16:44:17 -08:00
Sven Mika
aec03656d5
[RLlib] TupleActions cannot be exported by Policy: Fixes issues 7231 and 5593. #7333
2020-02-26 15:22:54 -08:00
Eric Liang
1660b52751
[rllib] Fix torch GPU / yaml load warning ( #7278 )
...
* fix
* safe load
* reduce num buffer shards
2020-02-23 13:13:43 -08:00
Sven Mika
0db2046b0a
[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107 ) ( #7124 )
...
* Exploration API (+EpsilonGreedy sub-class).
* Exploration API (+EpsilonGreedy sub-class).
* Cleanup/LINT.
* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).
* Add `error` option to deprecation_warning().
* WIP.
* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.
* WIP.
* LINT.
* WIP.
* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.
* Fix bug in sampler.py in case Policy has self.exploration = None
* Update rllib/agents/dqn/dqn.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Update rllib/agents/trainer.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Change requests.
* LINT
* In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set
* Completely obsolete sync_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).
* Update rllib/evaluation/worker_set.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Review fixes.
* Fix default value for DQN's exploration spec.
* LINT
* Fix recursion bug (wrong parent c'tor).
* Do not pass timestep to get_exploration_info.
* Update tf_policy.py
* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.
* Bug fix tf-action-dist
* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).
* Switch off exploration when getting action probs from off-policy-estimator's policy.
* LINT
* Fix test_checkpoint_restore.py.
* Deprecate all SAC exploration (unused) configs.
* Properly use `model.last_output()` everywhere instead of `model._last_output`.
* WIP.
* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).
* WIP.
* Trigger re-test (flaky checkpoint-restore test).
* WIP.
* WIP.
* Add test case for deterministic action sampling in PPO.
* bug fix.
* Added deterministic test cases for different Agents.
* Fix problem with TupleActions in dynamic-tf-policy.
* Separate supported_spaces tests so they can be run separately for easier debugging.
* LINT.
* Fix autoregressive_action_dist.py test case.
* Re-test.
* Fix.
* Remove duplicate py_test rule from bazel.
* LINT.
* WIP.
* WIP.
* SAC fix.
* SAC fix.
* WIP.
* WIP.
* WIP.
* FIX 2 examples tests.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Renamed test file.
* WIP.
* Add unittest.main.
* Make action_dist_class mandatory.
* fix
* FIX.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix explorations test case (contextlib cannot find its own nullcontext??).
* Force torch to be installed for QMIX.
* LINT.
* Fix determine_tests_to_run.py.
* Fix determine_tests_to_run.py.
* WIP
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Rename some stuff.
* Rename some stuff.
* WIP.
* WIP.
* Fix SAC.
* Fix SAC.
* Fix strange tf-error in ray core tests.
* Fix strange ray-core tf-error in test_memory_scheduling test case.
* Fix test_io.py.
* LINT.
* Update SAC yaml files' config.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-02-22 14:19:49 -08:00
Sven Mika
6043ce710d
Fix old exploration configs. ( #7240 )
2020-02-20 08:39:16 -08:00
Sven Mika
d537e9f0d8
[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). ( #7155 )
2020-02-19 12:18:45 -08:00
Sven Mika
2e60f0d4d8
[RLlib] Move all jenkins RLlib-tests into bazel (rllib/BUILD). ( #7178 )
...
* commit
* comment
2020-02-15 14:50:44 -08:00
Sven Mika
6e1c3ea824
[RLlib] Exploration API (+EpsilonGreedy sub-class). ( #6974 )
2020-02-10 15:22:07 -08:00
Sven Mika
5ac5ac9560
[RLlib] Fix broken example: tf-eager with custom-RNN ( #6732 ). ( #7021 )
...
* WIP.
* Fix float32 conversion in OneHot preprocessor (would cause float64 in eager, then NN-matmul-failure).
Add proper seq-len + state-in construction in eager_tf_policy.py::_compute_gradients().
* LINT.
* eager_tf_policy.py: Only set samples["seq_lens"] if RNN. Otherwise, eager-tracing will throw flattened-dict key-mismatch error.
* Move issue code to examples folder.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-02-06 09:44:08 -08:00
roireshef
3c60caa448
[rllib] implemented compute_advantages without gae ( #6941 )
2020-01-31 22:25:45 -08:00
Sven Mika
c957ed58ed
[RLlib] Implement PPO torch version. ( #6826 )
2020-01-20 23:06:50 -08:00
Sven Mika
303547f119
[RLlib] Policy-classes cleanup and torch/tf unification. ( #6770 )
2020-01-17 22:26:28 -08:00
Sven
60d4d5e1aa
Remove future imports ( #6724 )
...
* Remove all __future__ imports from RLlib.
* Remove (object) again from tf_run_builder.py::TFRunBuilder.
* Fix 2xLINT warnings.
* Fix broken appo_policy import (must be appo_tf_policy)
* Remove future imports from all other ray files (not just RLlib).
* Remove future imports from all other ray files (not just RLlib).
* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).
* Add two empty lines before Schedule class.
* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. ( #6658 )
2020-01-02 17:42:13 -08:00
Sven
f1b56fa5ee
PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). ( #6650 )
...
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).
* Fix LINT line-len errors.
* Fix LINT errors.
* Fix `tf_pg_policy` imports (formerly: `pg_policy`).
* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).
* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
then built into the Bazel/Travis test suite.
* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.
* Fix remaining import errors for agents/pg/...
* Fix circular dependency in pg imports.
* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Sven
8b16847c02
Get utils ready for better Agent torch support. ( #6561 )
2019-12-30 12:27:32 -08:00
Michael Luo
548df014ec
SAC Performance Fixes ( #6295 )
...
* SAC Performance Fixes
* Small Changes
* Update sac_model.py
* fix normalize wrapper
* Update test_eager_support.py
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2019-12-20 10:51:25 -08:00
Eric Liang
be5dd8eb5e
Enable direct calls by default ( #6367 )
...
* wip
* add
* timeout fix
* const ref
* comments
* fix
* fix
* Move actor state into actor handle
* comments 2
* enable by default
* temp reorder
* some fixes
* add debug code
* tmp
* fix
* wip
* remove dbg
* fix compile
* fix
* fix check
* remove non direct tests
* Increment ref count before resolving value
* rename
* fix another bug
* tmp
* tmp
* Fix object pinning
* build change
* lint
* ActorManager
* tmp
* ActorManager
* fix test component failures
* Remove old code
* Remove unused
* fix
* fix
* fix resources
* fix advanced
* eric's diff
* blacklist
* blacklist
* cleanup
* annotate
* disable tests for now
* remove
* fix
* fix
* clean up verbosity
* fix test
* fix concurrency test
* Update .travis.yml
* Update .travis.yml
* Update .travis.yml
* split up analysis suite
* split up trial runner suite
* fix detached direct actors
* fix
* split up advanced test
* lint
* fix core worker test hang
* fix bad check fail which breaks test_cluster.py in tune
* fix some minor diffs in test_cluster
* less workers
* make less stressful
* split up test
* retry flaky tests
* remove old test flags
* fixes
* lint
* Update worker_pool.cc
* fix race
* fix
* fix bugs in node failure handling
* fix race condition
* fix bugs in node failure handling
* fix race condition
* nits
* fix test
* disable heartbeats
* disable heartbeats
* fix
* fix
* use worker id
* fix max fail
* debug exit
* fix merge, and apply [PATCH] fix concurrency test
* [patch] fix core worker test hang
* remove NotifyActorCreation, and return worker on completion of actor creation task
* remove actor died callback
* Update core_worker.cc
* lint
* use task manager
* fix merge
* fix deadlock
* wip
* merge conflicts
* fix
* better sysexit handling
* better sysexit handling
* better sysexit handling
* check id
* better debug
* task failed msg
* task failed msg
* retry failed tasks with delay
* retry failed tasks with delay
* clip deps
* fix
* fix core worker tests
* fix task manager test
* fix all tests
* cleanup
* set to 0 for direct tests
* dont check worker id for ownership rpc
* dont check worker id for ownership rpc
* debug messages
* add comment
* remove debug statements
* nit
* check worker id
* fix test
* owner
* fix tests
2019-12-13 13:58:04 -08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py ( #6306 )
...
* multinode failures direct
* Add number of retries allowed for tasks
* Retry tasks
* Add failing test for object reconstruction
* Handle return status and debug
* update
* Retry task unit test
* update
* update
* todo
* Fix max_retries decorator, fix test
* Fix test that flaked
* lint
* comments
2019-12-02 10:20:57 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config
2019-11-07 12:17:27 -08:00
David Bignell
3f83b2daa9
[rllib] Rollout extensions ( #6065 )
...
* Rollout improvements
* Make info-saving optional, to avoid breaking change.
* Store generating ray version in checkpoint metadata
* Keep the linter happy
* Add small rollout test
* Terse.
* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 ( #6087 )
2019-11-05 11:36:29 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode ( #6001 )
2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time ( #5980 )
...
* fix
* move example tests to end
* level err
* guard against none
* no trace test
* ignore thumbs
* np
* fix multi node
* fix
2019-10-26 13:23:42 -07:00
Eric Liang
f7bda0abad
[rllib] Fix rnn shape with multi-dimensional data ( #5939 )
...
* fix shape
* add test
* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Eric Liang
fb3b232c0e
[rllib] Properly flatten 2-d observations as input to FCnet ( #5733 )
2019-09-19 12:10:31 -07:00
gehring
8903bcd0c3
[rllib] Tracing for eager tensorflow policies with tf.function ( #5705 )
...
* Added tracing of eager policies with `tf.function`
* lint
* add config option
* add docs
* wip
* tracing now works with a3c
* typo
* none
* file doc
* returns
* syntax error
* syntax error
2019-09-17 01:44:20 -07:00
Richard Liaw
0010f54378
Update Cloudpickle ( #5643 )
2019-09-09 17:17:29 -07:00
Eric Liang
19bbf1eb4d
[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern ( #5626 )
2019-09-04 21:39:22 -07:00
Eric Liang
97ccd75952
[rllib] Enable object store memory limit by default ( #5534 )
2019-08-26 01:37:28 -07:00