Sven Mika
|
e74947cc94
|
[RLlib] Env directory cleanup and tests. (#13082)
|
2021-01-19 10:09:39 +01:00 |
|
Sven Mika
|
56878221ed
|
[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363)
|
2021-01-14 14:44:33 +01:00 |
|
Kai Fricke
|
25f10a947a
|
Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
|
2021-01-12 12:33:57 +01:00 |
|
Sven Mika
|
e2b2abb88b
|
[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)
|
2021-01-11 22:42:30 +01:00 |
|
Sven Mika
|
6f342a2221
|
[RLlib] Preparatory PR for: Documentation on Model Building. (#13260)
|
2021-01-08 10:56:09 +01:00 |
|
Sven Mika
|
28ac4243f4
|
[RLlib] Deflake test case: 2-step game MADDPG. (#13121)
|
2020-12-30 18:37:37 -05:00 |
|
Sven Mika
|
d811d65920
|
[RLlib] run_regression_tests.py: --framework flag (instead of --torch). (#13097)
|
2020-12-29 15:27:59 -05:00 |
|
Sven Mika
|
a5318961de
|
[RLlib] Preprocessor fixes (multi-discrete) and tests. (#13083)
|
2020-12-26 20:14:36 -05:00 |
|
Sven Mika
|
d5604eaba3
|
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029)
|
2020-12-21 18:38:34 -08:00 |
|
Sven Mika
|
b2bcab711d
|
[RLlib] Attention Nets: tf (#12753)
|
2020-12-20 20:22:32 -05:00 |
|
Sven Mika
|
124c8318a8
|
[RLlib] Fix broken test_distributions.py (test_categorical) (#12915)
|
2020-12-17 17:44:26 -06:00 |
|
Edward Oakes
|
aedcf0c9d9
|
Disable test_distributions (#12919)
|
2020-12-16 14:17:49 -08:00 |
|
Sven Mika
|
deb33bce84
|
[RLlib] Add DQN SoftQ learning test case. (#12712)
|
2020-12-10 14:55:19 +01:00 |
|
Sven Mika
|
e40b14d255
|
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420)
|
2020-12-08 16:41:45 -08:00 |
|
Sven Mika
|
19c8033df2
|
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* LINT and fixes.
MB-MPO and MAML not working yet.
* wip
* update
* update
* remove
* remove dep
* higher
* Update requirements_rllib.txt
* Update requirements_rllib.txt
* relpos
* no mbmpo
Co-authored-by: Eric Liang <ekhliang@gmail.com>
|
2020-12-01 17:41:10 -08:00 |
|
Sven Mika
|
bb03e2499b
|
[RLlib] PyBullet Env native support via env str-specifier (if installed). (#12209)
|
2020-11-30 12:41:24 +01:00 |
|
Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
841d93d366
|
[RLlib] Issue 12233 shared tf layers example not really shared (only works for tf1.x, not tf2.x). (#12399)
|
2020-11-25 11:27:19 -08:00 |
|
Raoul Khouri
|
d07ffc152b
|
[rllib] Rrk/12079 custom filters (#12095)
* travis reformatted
|
2020-11-19 13:20:20 -08:00 |
|
Sven Mika
|
dab241dcc6
|
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063)
|
2020-11-19 19:01:14 +01:00 |
|
Michael Luo
|
59bc1e6c09
|
[RLLib] MAML extension for all models except RNNs (#11337)
|
2020-11-12 16:51:40 -08:00 |
|
Sven Mika
|
54d85a6c2a
|
[RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720)
|
2020-11-02 11:18:41 +01:00 |
|
Sven Mika
|
a6a94d3206
|
[RLlib] Fix test_env_with_subprocess.py. (#11356)
|
2020-10-13 12:42:20 -07:00 |
|
Sven Mika
|
8ea1bc5ff9
|
[RLlib] Allow for more than 2^31 policy timesteps. (#11301)
|
2020-10-12 13:49:11 -07:00 |
|
Sven Mika
|
d3bc20b727
|
[RLlib] ConvTranspose2D module (#11231)
|
2020-10-12 15:00:42 +02:00 |
|
desktable
|
f9621ce23c
|
[RLlib] Add recsim_wrapper unit test to BUILD (#11225)
|
2020-10-08 08:23:27 +02:00 |
|
Sven Mika
|
ce96b03b07
|
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033)
|
2020-10-06 20:28:16 +02:00 |
|
Sven Mika
|
4b278c36fc
|
[RLlib] Behavioral Cloning (from MARWIL). (#10619)
|
2020-09-09 17:33:21 +02:00 |
|
Sven Mika
|
28ab797cf5
|
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544)
|
2020-09-06 10:58:00 +02:00 |
|
Sven Mika
|
e968b52cb7
|
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)
|
2020-08-21 12:35:16 +02:00 |
|
Sven Mika
|
2cbe29a7fa
|
[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143)
|
2020-08-19 17:49:50 +02:00 |
|
Sven Mika
|
66d204e078
|
[RLlib] Model documentation enhancements. (#10011)
|
2020-08-13 13:36:40 +02:00 |
|
Sven Mika
|
4b10bdf8fc
|
[RLlib] rollout.py - Add multi-agent test case. (#9981)
|
2020-08-10 19:44:23 +02:00 |
|
Sven Mika
|
57690a3a9f
|
[RLlib] Trajectory view API - 02 actual API scaffold (#9753)
|
2020-08-06 10:54:20 +02:00 |
|
Sven Mika
|
19d785b947
|
[LINT] Except RLlib from checking for flake8 error F821 (#9946)
|
2020-08-06 10:44:37 +02:00 |
|
Sven Mika
|
e540e425e4
|
[RLlib] rllib rollout test and bug fixes. (#9779)
|
2020-07-30 16:17:03 +02:00 |
|
Sven Mika
|
5dc4b6686e
|
[RLlib] Implement DQN PyTorch distributional head. (#9589)
|
2020-07-25 09:29:24 +02:00 |
|
Sven Mika
|
887cf5eca7
|
MADDPG learning confirmation test. (#9538)
|
2020-07-17 20:18:02 +02:00 |
|
Sven Mika
|
78dfed2683
|
[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527)
|
2020-07-17 12:14:34 +02:00 |
|
Sven Mika
|
617eb8f279
|
[RLlib] Issue 9402 MARWIL producing nan rewards. (#9429)
|
2020-07-14 05:07:16 +02:00 |
|
Sven Mika
|
fcdf410ae1
|
[RLlib] Tf2.x native. (#8752)
|
2020-07-11 22:06:35 +02:00 |
|
Sven Mika
|
01125b8fcf
|
[RLlib] DQN rainbow eager-mode (keras style NoisyLayer) (preparation for native tf2.x support). (#9304)
|
2020-07-09 10:44:10 +02:00 |
|
Sven Mika
|
4da0e542d5
|
[RLlib] DDPG and SAC eager support (preparation for tf2.x) (#9204)
|
2020-07-08 16:12:20 +02:00 |
|
Benjamin Black
|
1425cdf834
|
Pettingzoo environment support (#9271)
* added pettingzoo wrapper env and example
* added docs, examples for pettingzoo env support
* fixed pettingzoo env flake8, added test
* fixed pettingzoo env import
* fixed pettingzoo env import
* fixed pettingzoo import issue
* fixed pettingzoo test
* fixed linting problem
* fixed bad quotes
* future proofed pettingzoo dependency
* fixed ray init in pettingzoo env
* lint
* manual lint
Co-authored-by: Eric Liang <ekhliang@gmail.com>
|
2020-07-06 21:32:26 -07:00 |
|
Sven Mika
|
f43d934817
|
[RLlib] Type annotations for policy. (#9248)
|
2020-07-05 13:09:51 +02:00 |
|
Sven Mika
|
5b2a97597b
|
[RLlib] Retire try_import_tree (should be installed along with other requirements). (#9211)
- Retire try_import_tree.
- Stabilize test_supported_multi_agent.py.
|
2020-07-02 13:06:34 +02:00 |
|
Sven Mika
|
43043ee4d5
|
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf() ). (#9136)
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
|
2020-06-30 10:13:20 +02:00 |
|
Sven Mika
|
5c6d5d4ab1
|
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). (#8970)
|
2020-06-27 20:50:01 +02:00 |
|
Sven Mika
|
4fd8977eaf
|
[RLlib] Minor cleanup in preparation to tf2.x support. (#9130)
* WIP.
* Fixes.
* LINT.
* Fixes.
* Fixes and LINT.
* WIP.
|
2020-06-25 19:01:32 +02:00 |
|
Sven Mika
|
aa231799ed
|
Dyna test: small -> medium. (#9118)
|
2020-06-24 12:02:44 +02:00 |
|