Commit graph

340 commits

Author SHA1 Message Date
Sven Mika
52c94b7ee9
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Sven Mika
9423930bcc
[RLlib] MAML: Add cartpole mass test for PyTorch. (#13679) 2021-01-25 12:32:41 +01:00
Sven Mika
e74947cc94
[RLlib] Env directory cleanup and tests. (#13082) 2021-01-19 10:09:39 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. (#13397) 2021-01-19 09:51:35 +01:00
Sven Mika
d98235cc84
[RLlib] Deflake 2x remote & local inference tests (external env). (#13459) 2021-01-14 20:44:26 +01:00
Sven Mika
56878221ed
[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363) 2021-01-14 14:44:33 +01:00
Kai Fricke
25f10a947a
Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
2021-01-12 12:33:57 +01:00
Sven Mika
e2b2abb88b
[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339) 2021-01-11 22:42:30 +01:00
Sven Mika
9dd9f72111
[RLlib] Add more detailed Documentation on Model building API (#13261) 2021-01-09 12:38:29 +01:00
Sven Mika
6f342a2221
[RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
Basu Jindal
4e569ee20b
Update multi_agent_independent_learning.py (#13196)
pettingzoo.utils.error.DeprecatedEnv: waterworld_v0 is now depreciated, use waterworld_v2 instead
2021-01-05 13:44:54 -08:00
Sven Mika
9eba1871bb
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
Sven Mika
670d083a56
[RLlib] Fix broken unity3d_env import in example server script. (#13040) 2020-12-23 11:29:58 -05:00
Sven Mika
d5604eaba3
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029) 2020-12-21 18:38:34 -08:00
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be confgurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
592c161032
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika
841d93d366
[RLlib] Issue 12233 shared tf layers example not really shared (only works for tf1.x, not tf2.x). (#12399) 2020-11-25 11:27:19 -08:00
Raoul Khouri
d07ffc152b
[rllib] Rrk/12079 custom filters (#12095)
* travis reformatted
2020-11-19 13:20:20 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063) 2020-11-19 19:01:14 +01:00
Michael Luo
6e6c680f14
MBMPO Cartpole (#11832)
* MBMPO Cartpole Done

* Added doc
2020-11-12 10:30:41 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Benjamin Black
1999266bba
Updated pettingzoo env to acomidate api changes and fixes (#11873)
* Updated pettingzoo env to acomidate api changes and fixes

* fixed test failure

* fixed linting issue

* fixed test failure
2020-11-09 16:09:49 -08:00
Pierre TASSEL
66605cfcbd
[RLLib] Random Parametric Trainer (#11366) 2020-11-04 11:12:51 +01:00
desktable
5af745c90d
[RLlib] Implement the SlateQ algorithm (#11450) 2020-11-03 09:52:04 +01:00
Lara Codeca
e735add268
[RLlib] Integration with SUMO Simulator (#11710) 2020-11-03 09:45:03 +01:00
Sven Mika
54d85a6c2a
[RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720) 2020-11-02 11:18:41 +01:00
Sven Mika
8ea1bc5ff9
[RLlib] Allow for more than 2^31 policy timesteps. (#11301) 2020-10-12 13:49:11 -07:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika
c17169dc11
[RLlib] Fix all example scripts to run on GPUs. (#11105) 2020-10-02 23:07:44 +02:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Eric Liang
daa03ba6e6
[rllib] Add execution module to package ref (#10941)
* add init

* add

* update
2020-09-21 23:03:06 -07:00
Sven Mika
d7c42d6d92
[RLlib] Unity blogpost final fixes. (#10894) 2020-09-20 14:13:20 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Benjamin Black
f2408b719c
Fixed PettingZooEnv (#10847) 2020-09-17 11:28:42 -07:00
Sven Mika
4b278c36fc
[RLlib] Behavioral Cloning (from MARWIL). (#10619) 2020-09-09 17:33:21 +02:00
Michael Luo
8e613652af
[RLLib] MBMPO Fixes (#10296) 2020-09-09 09:34:34 +02:00
Sven Mika
28ab797cf5
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
Richard Liaw
551c597312
[tune] API revamp fix (#10518) 2020-09-05 15:34:53 -07:00
Sven Mika
244aafdcf8
[RLlib] Curiosity enhancements. (#10373) 2020-09-05 13:14:24 +02:00
Justin Terry
352718610d
Multi-agent Algorithm Documentation Updates (#9722) 2020-09-03 22:37:46 -07:00
Sven Mika
715ee8dfc9
[RLlib] Issue 10469: Callbacks should receive env idx ... (#10477) 2020-09-03 17:27:05 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420) 2020-09-02 14:03:01 +02:00
Michael Luo
4e9888ce2f
[RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00