Sven Mika
db058d0fb3
[RLlib] Rename metrics_smoothing_episodes into metrics_num_episodes_for_smoothing for clarity. ( #20983 )
2021-12-11 20:33:35 +01:00
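A hedged sketch of what the rename above means for a Trainer config; the PPO trainer, the environment, and the value 100 are illustrative assumptions, not taken from the commit:

    # Sketch only, assuming the Ray 1.x `agents` API.
    from ray.rllib.agents.ppo import PPOTrainer

    config = {
        "env": "CartPole-v0",
        # Old, soft-deprecated key:
        # "metrics_smoothing_episodes": 100,
        # New key introduced by the commit above:
        "metrics_num_episodes_for_smoothing": 100,
    }
    trainer = PPOTrainer(config=config)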
Sven Mika
b4790900f5
[RLlib] Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer. ( #20725 )
2021-12-04 22:05:26 +01:00
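A minimal sketch of the pattern these commits move toward: sub-classing Trainer directly rather than generating a class via build_trainer(). MyTrainer, its empty default config, and the policy choice below are illustrative assumptions, not code from the PR:

    # Sketch only, assuming the Ray 1.x Trainer API.
    from ray.rllib.agents.ppo.ppo_torch_policy import PPOTorchPolicy
    from ray.rllib.agents.trainer import Trainer, with_common_config

    DEFAULT_CONFIG = with_common_config({
        # Algorithm-specific settings would go here.
    })

    class MyTrainer(Trainer):
        @classmethod
        def get_default_config(cls):
            return DEFAULT_CONFIG

        def get_default_policy_class(self, config):
            # Previously passed to build_trainer(default_policy=...).
            return PPOTorchPolicy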
Sven Mika
0de41e4a6b
[RLlib] Trainer sub-class QMIX/MAML/MB-MPO (instead of build_trainer). ( #20639 )
2021-12-02 13:17:10 +01:00
Sven Mika
f82880eda1
Revert "Revert [RLlib] POC: Deprecate build_policy
(policy template) for torch only; PPOTorchPolicy ( #20061 ) ( #20399 )" ( #20417 )
...
This reverts commit 90dc5460d4.
2021-11-16 14:49:41 +01:00
Amog Kamsetty
90dc5460d4
Revert "[RLlib] POC: Deprecate build_policy
(policy template) for torch only; PPOTorchPolicy ( #20061 )" ( #20399 )
...
This reverts commit 5b1c8e46e1.
2021-11-15 16:11:35 -08:00
Sven Mika
5b1c8e46e1
[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )
2021-11-15 10:41:54 +01:00
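A rough sketch of the direction this proof-of-concept takes: writing a Torch policy as a plain sub-class of TorchPolicy instead of generating it from a policy template. The class and its placeholder loss are illustrative assumptions, not the actual PPOTorchPolicy:

    # Sketch only, assuming the Ray 1.x TorchPolicy API.
    import torch
    from ray.rllib.policy.torch_policy import TorchPolicy

    class MyTorchPolicy(TorchPolicy):
        # What used to be a loss_fn passed to the policy-template builder
        # becomes an ordinary method override.
        def loss(self, model, dist_class, train_batch):
            logits, _ = model(train_batch)
            action_dist = dist_class(logits, model)
            # Placeholder objective for illustration only.
            return -torch.mean(action_dist.logp(train_batch["actions"]))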
Sven Mika
cf21c634a3
[RLlib] Fix deprecation warning for torch_ops.py (soft-replaced by torch_utils.py). ( #19982 )
2021-11-03 10:00:46 +01:00
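The commit above concerns the soft replacement of the torch_ops.py module by torch_utils.py; a small sketch of the corresponding import change (convert_to_torch_tensor is just one example helper from that module):

    import numpy as np

    # Old location (kept as a soft-deprecated alias that warns):
    # from ray.rllib.utils.torch_ops import convert_to_torch_tensor
    # New location:
    from ray.rllib.utils.torch_utils import convert_to_torch_tensor

    tensor = convert_to_torch_tensor(np.zeros((2, 3)))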
gjoliver
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents ( #19627 )
...
* [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents.
This change:
1. Gets rid of the try...except clause around the execution_plan() call, and of the deprecation warning that came with it.
2. Fixes the execution_plan() call in Trainer._try_recover() as well.
3. Most importantly, makes it much easier to create and use different types of local replay buffers for all our agents, e.g., a reservoir-sampling replay buffer for the APPO agent for Riot in the near future.
* Introduce explicit configuration for replay buffer types (see the sketch below).
* Fix is_training key error.
* Actually deprecate the buffer_size field.
2021-10-26 20:56:02 +02:00
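A hedged sketch of the "explicit configuration for replay buffer types" mentioned in the commit body, assuming the replay_buffer_config dict used by later RLlib versions; the buffer type name and capacity are illustrative:

    # Illustrative only: configure the local replay buffer explicitly instead
    # of using the deprecated top-level `buffer_size` field.
    config = {
        # "buffer_size": 50000,  # deprecated by the commit above
        "replay_buffer_config": {
            "type": "MultiAgentReplayBuffer",  # assumed type name
            "capacity": 50000,
        },
    }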
Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). ( #19693 )
2021-10-25 15:00:00 +02:00
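A brief sketch of the soft deprecation above: calling the ModelV2 instance directly on a batch instead of going through ModelV2.from_batch. The helper function is illustrative:

    # Sketch only, assuming the Ray 1.x ModelV2/SampleBatch APIs.
    from ray.rllib.models.modelv2 import ModelV2
    from ray.rllib.policy.sample_batch import SampleBatch

    def forward_pass(model: ModelV2, train_batch: SampleBatch):
        # Deprecated style (still works, but warns):
        # logits, state = model.from_batch(train_batch)
        # Preferred style: call the model directly.
        logits, state = model(train_batch)
        return logits, state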
gjoliver
9226f9bddc
[RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. ( #19264 )
2021-10-12 16:03:41 +02:00
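A hedged sketch of where the reported value surfaces in a plain training loop; the helper and its trainer argument are illustrative, not part of the PR:

    def log_timesteps(trainer):
        # `trainer` is any RLlib Trainer instance (assumed).
        result = trainer.train()
        # `timesteps_this_iter` is what the commit above reports to Tune so it
        # can track/checkpoint/restore the total timesteps trained.
        print(result["timesteps_this_iter"], result["timesteps_total"])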
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
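A small sketch of the unified result layout named in the commit title; the "default_policy" ID assumes a single-agent run, and the helper itself is illustrative:

    def get_learner_stats(result, policy_id="default_policy"):
        # Unified layout: results[info][learner][policy ID][learner_stats].
        return result["info"]["learner"][policy_id]["learner_stats"]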
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support ( #15841 )
2021-05-18 11:10:46 +02:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. ( #13393 )
2021-03-08 15:41:27 +01:00
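A tiny hedged sketch of the setting the commit above makes usable for tf-DQN/PG/A2C; the GPU count is illustrative:

    config = {
        "framework": "tf",
        # Multi-GPU training path; the count below is just an example.
        "num_gpus": 2,
    }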
Sven Mika
8000258333
[RLlib] R2D2 Implementation. ( #13933 )
2021-02-25 12:18:11 +01:00
Chace Ashcraft
ebeee1d59a
[RLlib] Pytorch MAML fix for more than two workers with discrete actions ( #13835 )
2021-02-08 12:06:02 +01:00
Sven Mika
9423930bcc
[RLlib] MAML: Add cartpole mass test for PyTorch. ( #13679 )
2021-01-25 12:32:41 +01:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. ( #13238 )
2021-01-19 14:22:36 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. ( #13397 )
2021-01-19 09:51:35 +01:00
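A hedged sketch of where vf_share_layers lives after this deprecation: inside the model sub-config rather than at the top level of the PPO/MAML/MB-MPO config:

    config = {
        # "vf_share_layers": True,  # deprecated top-level location
        "model": {
            "vf_share_layers": True,  # new location inside the model config
        },
    }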
Sven Mika
8726521604
[RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). ( #13091 )
2020-12-30 22:30:52 -05:00
Michael Luo
eae7a1f433
[RLLib] Readme.md Documentation for Almost All Algorithms in rllib/agents ( #13035 )
2020-12-29 18:45:55 -05:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1 . ( #13077 )
2020-12-26 20:14:18 -05:00
Michael Luo
59bc1e6c09
[RLLib] MAML extension for all models except RNNs ( #11337 )
2020-11-12 16:51:40 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). ( #11747 )
2020-11-12 16:27:34 +01:00
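A hedged sketch of the switch the commit above flips on by default for PPO, IMPALA, PG, and A3C; the flag name is recalled from that era's common config and should be treated as an assumption:

    config = {
        # Trajectory view API, now on by default for the listed algorithms;
        # the flag name below is an assumption.
        "_use_trajectory_view_api": True,
    }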
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. ( #11307 )
2020-10-15 18:21:30 +02:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). ( #11033 )
2020-10-06 20:28:16 +02:00
Michael Luo
47b499d899
Cartpole MAML + Discrete ( #11028 )
2020-10-02 12:56:34 +02:00
Michael Luo
8e613652af
[RLLib] MBMPO Fixes ( #10296 )
2020-09-09 09:34:34 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. ( #10420 )
2020-09-02 14:03:01 +02:00
Michael Luo
4d7bd8c892
[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) ( #9409 )
2020-08-02 18:12:09 +02:00
Michael Luo
94fcd43593
[rllib] MAML Transform ( #9463 )
...
* MAML Transform
* Moved Inner Adapt to Method in Execution Plan
2020-07-16 11:11:33 -07:00
Michael Luo
851d02463b
[Doc] RLlib Algorithms Documentation: MAML + PyTorch MAML ( #9189 )
2020-07-03 11:05:15 -07:00
Sven Mika
b4c0b942fe
[RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) ( #9237 )
2020-07-01 17:31:44 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). ( #9136 )
2020-06-30 10:13:20 +02:00
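For context, a short sketch of the try_import_tf() usage this "Tf2x preparation" step upgrades toward, assuming the triple-return form (tf1 module, unified tf module, major version):

    # Sketch only; try_import_tf() degrades gracefully if TF is not installed.
    from ray.rllib.utils.framework import try_import_tf

    tf1, tf, tfv = try_import_tf()
    if tf is not None:
        print("TensorFlow major version:", tfv)  # 1 or 2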
Michael Luo
cf0894d396
[rllib] MAML Agent ( #8862 )
...
* Halfway done with transferring MAML to new Ray
* MAML Beta Out
* Debugging MAML atm
* Distributed Execution
* Pendulum Mass Working
* All experiments complete
* Cleaned up codebase
* Travis CI
* Tests
* Merged conflicts
* Fixed variance bug conflict
* Comment resolved
* Apply suggestions from code review (fixed test_maml)
* Update rllib/agents/maml/tests/test_maml.py
* Fix testing
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-06-23 09:48:23 -07:00