Richard Liaw
a2d2275ee1
Revert "[RLlib + Tune] Add placement group support to RLlib. ( #14289 )" ( #14360 )
...
This reverts commit 6cd0cd3bd9
.
2021-02-25 14:27:35 -08:00
Sven Mika
6cd0cd3bd9
[RLlib + Tune] Add placement group support to RLlib. ( #14289 )
2021-02-25 16:01:31 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. ( #13933 )
2021-02-25 12:18:11 +01:00
QuantumMecha
0c93bb77cb
[RLlib] Update Documentation for Curiosity's support of continuous actions ( #13784 )
...
Only (Multi)Discrete action spaces are supported so far according to https://github.com/ray-project/ray/blob/master/rllib/utils/exploration/curiosity.py
2021-02-02 13:10:09 +01:00
Sven Mika
9dd9f72111
[RLlib] Add more detailed Documentation on Model building API ( #13261 )
2021-01-09 12:38:29 +01:00
Michael Luo
67229bf350
[RLlib] SlateQ Documentation ( #13266 )
2021-01-09 11:21:51 +01:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. ( #12718 )
2020-12-30 17:32:21 -08:00
Michael Luo
6e6c680f14
MBMPO Cartpole ( #11832 )
...
* MBMPO Cartpole Done
* Added doc
2020-11-12 10:30:41 -08:00
Eric Liang
9b8218aabd
[docs] Move all /latest links to /master ( #11897 )
...
* use master link
* remae
* revert non-ray
* more
* mre
2020-11-10 10:53:28 -08:00
Yutai Zhou
6999db93cb
Un-indent multiagent section ( #11310 )
...
* Un-indent multiagent section
MARL section used to be nested inside bandits, which we probably don't want. Maybe give it its own section instead?
2020-10-29 16:12:48 +01:00
huyz-git
64e3c9741a
Update rllib-algorithms.rst ( #11642 )
2020-10-28 15:07:10 -07:00
Sven Mika
f91c455527
[RLlib] Curiosity documentation. ( #11066 )
2020-09-29 09:39:22 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. ( #10825 )
2020-09-20 11:27:02 +02:00
Eric Liang
f7d5aa46a3
[hotfix] Fix table formatting ( #10687 )
2020-09-09 16:08:54 -07:00
Justin Terry
8a1caf6279
rename centralized critic to shared critic ( #10610 )
2020-09-09 15:49:32 -07:00
Sven Mika
4b278c36fc
[RLlib] Behavioral Cloning (from MARWIL). ( #10619 )
2020-09-09 17:33:21 +02:00
Michael Luo
8e613652af
[RLLib] MBMPO Fixes ( #10296 )
2020-09-09 09:34:34 +02:00
Simon Mo
5a38a76c83
[Doc] Use sphinx_book_theme ( #10379 )
2020-09-08 16:25:23 -07:00
Justin Terry
352718610d
Multi-agent Algorithm Documentation Updates ( #9722 )
2020-09-03 22:37:46 -07:00
Michael Luo
4e9888ce2f
[RLlib] Dreamer ( #10172 )
2020-08-26 13:24:05 +02:00
Matthew Strawbridge
7a5af7e744
Fix links to ddpg tuned examples ( #9713 )
2020-08-25 11:30:13 -07:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. ( #10115 )
2020-08-20 17:05:57 +02:00
Sven Mika
fe0bdb23ff
[RLlib] Attention Net/Transformers docs improvement.
2020-08-17 13:07:17 -07:00
Justin Terry
0d67602051
Update rllib-algorithms.rst ( #9640 )
2020-07-24 19:35:28 -07:00
Sven Mika
78dfed2683
[RLlib] Issue 8384: QMIX doesn't learn anything. ( #9527 )
2020-07-17 12:14:34 +02:00
Michael Luo
851d02463b
[Doc] RLlib Algorithms Documentation: MAML + PyTorch MAML ( #9189 )
2020-07-03 11:05:15 -07:00
Sven Mika
a90cd0fcbb
[RLlib] Unity3d soccer benchmarks ( #8834 )
2020-06-11 14:29:57 +02:00
Chapman Siu
04cffb7e65
[docs] rllib-models.rst
- QMIX +parametric ( #8868 )
...
Updating docs to show that QMIX supports parametric action space, as per SMAC environments.
This is reflected in the code here: https://github.com/ray-project/ray/blob/master/rllib/agents/qmix/qmix_policy.py#L179 and consistent with QMIX being an extension of DQN
2020-06-09 21:56:16 -07:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch
in favor of framework=...
( #8520 )
2020-05-27 16:19:13 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers ( #8345 )
2020-05-21 10:16:18 -07:00
Sven Mika
b95e28faea
[RLlib] APEX_DDPG (PyTorch) test case and docs. ( #8288 )
...
APEX_DDPG (PyTorch) test case and docs.
2020-05-04 09:36:27 +02:00
Sven Mika
166bb5d690
[RLlib] IMPALA PyTorch ( #8287 )
...
This PR adds an IMPALA PyTorch implementation.
- adds compilation tests for LSTM and w/o LSTM.
- adds learning test for CartPole.
2020-05-03 13:44:25 +02:00
Sven Mika
499ad5fbe4
[RLlib] PyTorch version of APPO. ( #8120 )
...
- Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
2020-04-23 09:11:12 +02:00
Sven Mika
d15609ba2a
[RLlib] PyTorch version of ARS (Augmented Random Search). ( #8106 )
...
This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.
2020-04-21 09:47:52 +02:00
Sven Mika
3812bfedda
[RLlib] PyTorch version of ES (Evolution Strategies). ( #8104 )
...
PyTorch version of Evolution Strategies (ES) Algo.
2020-04-20 21:47:28 +02:00
Sven Mika
d0fab84e4d
[RLlib] DDPG PyTorch version. ( #7953 )
...
The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib.
This PR:
- Depends on the re-factor PR for DDPG (Functional Algorithm API).
- Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch)
- Updates the documentation to reflect that DDPG and TD3 now support PyTorch.
* Learning Pendulum-v0 on torch version (same config as tf). Wall time a little slower (~20% than tf).
* Fix GPU target model problem.
2020-04-16 10:20:01 +02:00
Sven Mika
d2b5c171cb
[RLlib] Add pytorch sigils to toc and add links to algo overview table. ( #7950 )
...
* Add torch sigils to toc-tree for DQN/APEX.
* WIP.
2020-04-09 10:40:18 -07:00
Sven Mika
22ccc43670
[RLlib] DQN torch version. ( #7597 )
...
* Fix.
* Rollback.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix.
* Fix.
* Fix.
* WIP.
* WIP.
* Fix.
* Test case fixes.
* Test case fixes and LINT.
* Test case fixes and LINT.
* Rollback.
* WIP.
* WIP.
* Test case fixes.
* Fix.
* Fix.
* Fix.
* Add regression test for DQN w/ param noise.
* Fixes and LINT.
* Fixes and LINT.
* Fixes and LINT.
* Fixes and LINT.
* Fixes and LINT.
* Comment
* Regression test case.
* WIP.
* WIP.
* LINT.
* LINT.
* WIP.
* Fix.
* Fix.
* Fix.
* LINT.
* Fix (SAC does currently not support eager).
* Fix.
* WIP.
* LINT.
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* WIP.
* Fix.
* LINT.
* LINT.
* Fix and LINT.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Fix.
* Fix and LINT.
* Update rllib/utils/exploration/exploration.py
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Fixes.
* WIP.
* LINT.
* Fixes and LINT.
* LINT and fixes.
* LINT.
* Move action_dist back into torch extra_action_out_fn and LINT.
* Working SimpleQ learning cartpole on both torch AND tf.
* Working Rainbow learning cartpole on tf.
* Working Rainbow learning cartpole on tf.
* WIP.
* LINT.
* LINT.
* Update docs and add torch to APEX test.
* LINT.
* Fix.
* LINT.
* Fix.
* Fix.
* Fix and docstrings.
* Fix broken RLlib tests in master.
* Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier).
* Fix error_outputs option in BAZEL for RLlib regression tests.
* Fix.
* Tune param-noise tests.
* LINT.
* Fix.
* Fix.
* test
* test
* test
* Fix.
* Fix.
* WIP.
* WIP.
* WIP.
* WIP.
* LINT.
* WIP.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-06 11:56:16 -07:00
Eric Liang
5cebee68d6
[rllib] Add scaling guide to documentation, improve bandit docs ( #7780 )
...
* update
* reword
* update
* ms
* multi node sgd
* reorder
* improve bandit docs
* contrib
* update
* ref
* improve refs
* fix build
* add pillow dep
* add pil
* update pil
* pillow
* remove false
2020-03-27 22:05:43 -07:00
Saurabh Gupta
6ddf84b019
Contextual Bandit algorithms (WIP) ( #7642 )
2020-03-26 13:41:16 -07:00
hubcity
3d0a8662b3
#7246 - Fixing broken links ( #7247 )
...
* #7246 - Fixing broken links
* Apply suggestions from code review
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-25 21:46:13 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
...
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest comptaible
* fix test
* fi sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
026f6884b5
[rllib] Add Decentralized DDPPO trainer and documentation ( #7088 )
2020-02-10 15:28:27 -08:00
Sven Mika
0e3960893a
[RLlib] Add rainbow config hint to algo-documentation. ( #7052 )
2020-02-05 12:01:43 -08:00
Eric Liang
6bb30c9f1b
fix links ( #6883 )
2020-01-22 01:06:07 -08:00
Eric Liang
14016535a5
[rllib] Add TF and Torch icons to show which are available for each algo ( #6869 )
2020-01-20 15:22:21 -08:00
Sven Mika
7659cae3ba
[RLlib] Add PG torch regression test ( #6828 )
...
* Add PG torch regression test to tuned_examples/regression_tests dir.
* Rename cartpole-pg.yaml into cartpole-pg-tf.yaml
* cartpole-pg-tf.yaml: Change cartpole-pg name of tuned_example to cartpole-pg-tf.
2020-01-18 15:57:12 -08:00
Justin Terry
97bf79917c
[RLlib] Update MADDPG example repo to maintained fork ( #6831 )
2020-01-18 13:08:27 -08:00
Michael Luo
e5dded917c
SAC site changes ( #6759 )
2020-01-09 18:13:42 -08:00
Michael Luo
1cb335487e
SAC for Mujoco Environments ( #6642 )
2019-12-31 00:16:54 -08:00