Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
841d93d366
|
[RLlib] Issue 12233 shared tf layers example not really shared (only works for tf1.x, not tf2.x). (#12399)
|
2020-11-25 11:27:19 -08:00 |
|
Raoul Khouri
|
d07ffc152b
|
[rllib] Rrk/12079 custom filters (#12095)
* travis reformatted
|
2020-11-19 13:20:20 -08:00 |
|
Sven Mika
|
dab241dcc6
|
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063)
|
2020-11-19 19:01:14 +01:00 |
|
Michael Luo
|
6e6c680f14
|
MBMPO Cartpole (#11832)
* MBMPO Cartpole Done
* Added doc
|
2020-11-12 10:30:41 -08:00 |
|
Sven Mika
|
62c7ab5182
|
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
2020-11-12 16:27:34 +01:00 |
|
Benjamin Black
|
1999266bba
|
Updated pettingzoo env to accommodate api changes and fixes (#11873)
* Updated pettingzoo env to accommodate api changes and fixes
* fixed test failure
* fixed linting issue
* fixed test failure
|
2020-11-09 16:09:49 -08:00 |
|
Pierre TASSEL
|
66605cfcbd
|
[RLLib] Random Parametric Trainer (#11366)
|
2020-11-04 11:12:51 +01:00 |
|
desktable
|
5af745c90d
|
[RLlib] Implement the SlateQ algorithm (#11450)
|
2020-11-03 09:52:04 +01:00 |
|
Lara Codeca
|
e735add268
|
[RLlib] Integration with SUMO Simulator (#11710)
|
2020-11-03 09:45:03 +01:00 |
|
Sven Mika
|
54d85a6c2a
|
[RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720)
|
2020-11-02 11:18:41 +01:00 |
|
Sven Mika
|
8ea1bc5ff9
|
[RLlib] Allow for more than 2^31 policy timesteps. (#11301)
|
2020-10-12 13:49:11 -07:00 |
|
Sven Mika
|
ce96b03b07
|
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033)
|
2020-10-06 20:28:16 +02:00 |
|
Sven Mika
|
c17169dc11
|
[RLlib] Fix all example scripts to run on GPUs. (#11105)
|
2020-10-02 23:07:44 +02:00 |
|
Sven Mika
|
36bda8432b
|
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)
|
2020-10-01 16:57:10 +02:00 |
|
Eric Liang
|
daa03ba6e6
|
[rllib] Add execution module to package ref (#10941)
* add init
* add
* update
|
2020-09-21 23:03:06 -07:00 |
|
Sven Mika
|
d7c42d6d92
|
[RLlib] Unity blogpost final fixes. (#10894)
|
2020-09-20 14:13:20 +02:00 |
|
Sven Mika
|
805dad3bc4
|
[RLlib] SAC algo cleanup. (#10825)
|
2020-09-20 11:27:02 +02:00 |
|
Benjamin Black
|
f2408b719c
|
Fixed PettingZooEnv (#10847)
|
2020-09-17 11:28:42 -07:00 |
|
Sven Mika
|
4b278c36fc
|
[RLlib] Behavioral Cloning (from MARWIL). (#10619)
|
2020-09-09 17:33:21 +02:00 |
|
Michael Luo
|
8e613652af
|
[RLLib] MBMPO Fixes (#10296)
|
2020-09-09 09:34:34 +02:00 |
|
Sven Mika
|
28ab797cf5
|
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544)
|
2020-09-06 10:58:00 +02:00 |
|
Richard Liaw
|
551c597312
|
[tune] API revamp fix (#10518)
|
2020-09-05 15:34:53 -07:00 |
|
Sven Mika
|
244aafdcf8
|
[RLlib] Curiosity enhancements. (#10373)
|
2020-09-05 13:14:24 +02:00 |
|
Justin Terry
|
352718610d
|
Multi-agent Algorithm Documentation Updates (#9722)
|
2020-09-03 22:37:46 -07:00 |
|
Sven Mika
|
715ee8dfc9
|
[RLlib] Issue 10469: Callbacks should receive env idx ... (#10477)
|
2020-09-03 17:27:05 +02:00 |
|
Sven Mika
|
ef18893fb5
|
[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420)
|
2020-09-02 14:03:01 +02:00 |
|
Michael Luo
|
4e9888ce2f
|
[RLlib] Dreamer (#10172)
|
2020-08-26 13:24:05 +02:00 |
|
Eric Liang
|
deea1861ab
|
[rllib] Try fixing torch GPU and masking errors (#10168)
|
2020-08-25 18:34:19 -07:00 |
|
Benjamin Black
|
2689fb439c
|
Fixed pettingzoo env example (#9973)
|
2020-08-25 13:22:25 +02:00 |
|
Michael Luo
|
48a39d7cb9
|
[RLlib] Deepmind Control Suite Examples (#9751)
|
2020-08-23 12:53:08 +02:00 |
|
Sven Mika
|
e968b52cb7
|
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)
|
2020-08-21 12:35:16 +02:00 |
|
Eric Liang
|
ca133e2699
|
[rllib] Remove extra model config kwargs passed incorrectly for Torch models (#10055)
|
2020-08-17 11:12:20 -07:00 |
|
Olli Huotari
|
9ff599cbb8
|
torch policy now includes model.metrics (#10121)
* torch policy now includes model.metrics
* Fixed tests to work with custom metrics
* Forgot to run format.sh
|
2020-08-15 10:43:11 -07:00 |
|
Barak Michener
|
8e76796fd0
|
ci: Redo format.sh --all script & backfill lint fixes (#9956)
|
2020-08-07 16:49:49 -07:00 |
|
Michael Luo
|
4d7bd8c892
|
[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409)
|
2020-08-02 18:12:09 +02:00 |
|
Sven Mika
|
b0b0463161
|
[RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678)
|
2020-07-29 21:15:09 +02:00 |
|
Sven Mika
|
e6ea33a03c
|
[RLlib] Enhance reward clipping test; add action_clipping tests. (#9684)
|
2020-07-28 10:44:54 +02:00 |
|
Sven Mika
|
5dc4b6686e
|
[RLlib] Implement DQN PyTorch distributional head. (#9589)
|
2020-07-25 09:29:24 +02:00 |
|
Sven Mika
|
78dfed2683
|
[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527)
|
2020-07-17 12:14:34 +02:00 |
|
Sven Mika
|
935d8308fb
|
[RLlib] Issue #9437 (PyTorch converts to CPU tensor, even if on GPU). (#9497)
|
2020-07-16 14:55:50 +02:00 |
|
Sven Mika
|
617eb8f279
|
[RLlib] Issue 9402 MARWIL producing nan rewards. (#9429)
|
2020-07-14 05:07:16 +02:00 |
|
Sven Mika
|
14160ca58c
|
[RLlib] Issue #9366 (DQN w/o dueling produces invalid actions). (#9386)
|
2020-07-10 12:43:03 +02:00 |
|
Benjamin Black
|
1425cdf834
|
Pettingzoo environment support (#9271)
* added pettingzoo wrapper env and example
* added docs, examples for pettingzoo env support
* fixed pettingzoo env flake8, added test
* fixed pettingzoo env import
* fixed pettingzoo env import
* fixed pettingzoo import issue
* fixed pettingzoo test
* fixed linting problem
* fixed bad quotes
* future proofed pettingzoo dependency
* fixed ray init in pettingzoo env
* lint
* manual lint
Co-authored-by: Eric Liang <ekhliang@gmail.com>
|
2020-07-06 21:32:26 -07:00 |
|
Sven Mika
|
f43d934817
|
[RLlib] Type annotations for policy. (#9248)
|
2020-07-05 13:09:51 +02:00 |
|
Michael Luo
|
851d02463b
|
[Doc] RLlib Algorithms Documentation: MAML + PyTorch MAML (#9189)
|
2020-07-03 11:05:15 -07:00 |
|
Sven Mika
|
5b2a97597b
|
[RLlib] Retire try_import_tree (should be installed along with other requirements). (#9211)
- Retire try_import_tree.
- Stabilize test_supported_multi_agent.py.
|
2020-07-02 13:06:34 +02:00 |
|
Sven Mika
|
43043ee4d5
|
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf() ). (#9136)
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
|
2020-06-30 10:13:20 +02:00 |
|
Sven Mika
|
5c6d5d4ab1
|
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). (#8970)
|
2020-06-27 20:50:01 +02:00 |
|
Sven Mika
|
af1203b9df
|
[RLlib] Issue 8507 (PyTorch does not support custom loss). (#9142)
|
2020-06-26 09:52:22 +02:00 |
|