ray/rllib/optimizers
Sven Mika 22ccc43670
[RLlib] DQN torch version. (#7597)
* Fix.

* Rollback.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* Fix.

* Fix.

* Fix.

* Fix.

* WIP.

* WIP.

* Fix.

* Test case fixes.

* Test case fixes and LINT.

* Test case fixes and LINT.

* Rollback.

* WIP.

* WIP.

* Test case fixes.

* Fix.

* Fix.

* Fix.

* Add regression test for DQN w/ param noise.

* Fixes and LINT.

* Fixes and LINT.

* Fixes and LINT.

* Fixes and LINT.

* Fixes and LINT.

* Comment

* Regression test case.

* WIP.

* WIP.

* LINT.

* LINT.

* WIP.

* Fix.

* Fix.

* Fix.

* LINT.

* Fix (SAC does currently not support eager).

* Fix.

* WIP.

* LINT.

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* WIP.

* Fix.

* LINT.

* LINT.

* Fix and LINT.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* LINT.

* Fix.

* Fix and LINT.

* Update rllib/utils/exploration/exploration.py

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Fixes.

* WIP.

* LINT.

* Fixes and LINT.

* LINT and fixes.

* LINT.

* Move action_dist back into torch extra_action_out_fn and LINT.

* Working SimpleQ learning cartpole on both torch AND tf.

* Working Rainbow learning cartpole on tf.

* Working Rainbow learning cartpole on tf.

* WIP.

* LINT.

* LINT.

* Update docs and add torch to APEX test.

* LINT.

* Fix.

* LINT.

* Fix.

* Fix.

* Fix and docstrings.

* Fix broken RLlib tests in master.

* Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier).

* Fix error_outputs option in BAZEL for RLlib regression tests.

* Fix.

* Tune param-noise tests.

* LINT.

* Fix.

* Fix.

* test

* test

* test

* Fix.

* Fix.

* WIP.

* WIP.

* WIP.

* WIP.

* LINT.

* WIP.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-06 11:56:16 -07:00
..
tests [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 2020-03-14 12:05:04 -07:00
__init__.py [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
aso_aggregator.py [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 2020-03-14 12:05:04 -07:00
aso_learner.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
aso_minibatch_buffer.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
aso_multi_gpu_learner.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
aso_tree_aggregator.py [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 2020-03-14 12:05:04 -07:00
async_gradients_optimizer.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
async_replay_optimizer.py [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 2020-03-14 12:05:04 -07:00
async_samples_optimizer.py [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 2020-03-14 12:05:04 -07:00
microbatch_optimizer.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
multi_gpu_impl.py [Core/RLlib] Move log_once from rllib to ray.util. (#7273) 2020-02-27 10:40:44 -08:00
multi_gpu_optimizer.py [RLlib] Bug fix: Copy is_exploring placeholder for multi-GPU tower generation. (#7846) 2020-04-03 10:44:58 -07:00
policy_optimizer.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
replay_buffer.py [RLlib] Fix bugs and speed up SegmentTree 2020-03-13 01:03:07 -07:00
rollout.py [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 2020-03-14 12:05:04 -07:00
segment_tree.py [RLlib] Fix bugs and speed up SegmentTree 2020-03-13 01:03:07 -07:00
sync_batch_replay_optimizer.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
sync_replay_optimizer.py [RLlib] DQN torch version. (#7597) 2020-04-06 11:56:16 -07:00
sync_samples_optimizer.py [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
torch_distributed_data_parallel_optimizer.py [rllib] Add Decentralized DDPPO trainer and documentation (#7088) 2020-02-10 15:28:27 -08:00