Sven Mika
|
407a3523f3
|
[RLlib] eval_workers after restore not generated in Trainer due to unintuitive config handling. (#12844)
|
2020-12-20 09:37:31 -05:00 |
|
Sven Mika
|
ea25482f6a
|
WIP. (#12706)
|
2020-12-09 11:49:21 -08:00 |
|
Sven Mika
|
e40b14d255
|
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420)
|
2020-12-08 16:41:45 -08:00 |
|
Sven Mika
|
99c81c6795
|
[RLlib] Attention Net prep PR #3. (#12450)
|
2020-12-07 13:08:17 +01:00 |
|
Sven Mika
|
19c8033df2
|
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.
* LINT and fixes.
MB-MPO and MAML not working yet.
* wip
* update
* update
* remove
* remove dep
* higher
* Update requirements_rllib.txt
* Update requirements_rllib.txt
* relpos
* no mbmpo
Co-authored-by: Eric Liang <ekhliang@gmail.com>
|
2020-12-01 17:41:10 -08:00 |
|
Sven Mika
|
bb03e2499b
|
[RLlib] PyBullet Env native support via env str-specifier (if installed). (#12209)
|
2020-11-30 12:41:24 +01:00 |
|
Sven Mika
|
fb318addcb
|
[RLlib] Curiosity exploration module: tf/tf2.x/tf-eager support. (#11945)
|
2020-11-29 12:31:24 +01:00 |
|
Pierre TASSEL
|
60a545ab57
|
[RLLib] Fix HyperOptSearch tuple to list conversion (#12462)
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
|
2020-11-28 10:07:54 -08:00 |
|
Sven Mika
|
0df55a139c
|
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.
* Fix.
* Fix.
* Fix.
|
2020-11-27 16:25:47 -08:00 |
|
Sven Mika
|
6475297bd3
|
[RLlib] Torch LR schedule not working. Fix and added test case. (#12396)
|
2020-11-26 13:14:11 +01:00 |
|
Sven Mika
|
b7dbbfbf41
|
[RLlib] Issue 11591: SAC loss does not use PR-weights in critic loss term. (#12394)
* WIP.
* Fix and LINT.
|
2020-11-25 11:28:46 -08:00 |
|
Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
dab241dcc6
|
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063)
|
2020-11-19 19:01:14 +01:00 |
|
Sven Mika
|
6da4342822
|
[RLlib] Add on_learn_on_batch (Policy) callback to DefaultCallbacks. (#12070)
|
2020-11-18 15:39:23 +01:00 |
|
Sven Mika
|
b6b54f1c81
|
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827)
|
2020-11-16 10:54:35 -08:00 |
|
Michael Luo
|
59bc1e6c09
|
[RLLib] MAML extension for all models except RNNs (#11337)
|
2020-11-12 16:51:40 -08:00 |
|
Sven Mika
|
0bd69edd71
|
[RLlib] Trajectory view API: enable by default for ES and ARS (#11826)
|
2020-11-12 10:33:10 -08:00 |
|
Michael Luo
|
6e6c680f14
|
MBMPO Cartpole (#11832)
* MBMPO Cartpole Done
* Added doc
|
2020-11-12 10:30:41 -08:00 |
|
Sven Mika
|
62c7ab5182
|
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
2020-11-12 16:27:34 +01:00 |
|
Sven Mika
|
291c172d83
|
[RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909)
|
2020-11-11 18:45:28 +01:00 |
|
Eric Liang
|
9b8218aabd
|
[docs] Move all /latest links to /master (#11897)
* use master link
* remae
* revert non-ray
* more
* mre
|
2020-11-10 10:53:28 -08:00 |
|
Eric Liang
|
6b7a4dfaa0
|
[rllib] Forgot to pass ioctx to child json readers (#11839)
* fix ioctx
* fix
|
2020-11-05 22:07:57 -08:00 |
|
Sven Mika
|
d6c7c7c675
|
[RLlib] Make sure, DQN torch actions are of type=long before torch.nn.functional.one_hot() op. (#11800)
|
2020-11-04 18:04:03 +01:00 |
|
desktable
|
5af745c90d
|
[RLlib] Implement the SlateQ algorithm (#11450)
|
2020-11-03 09:52:04 +01:00 |
|
Sven Mika
|
bfc4f95e01
|
[RLlib] Fix test_bc.py test case. (#11722)
* Fix large json test file.
* Fix large json test file.
* WIP.
|
2020-10-31 00:16:09 -07:00 |
|
Sven Mika
|
d9f1874e34
|
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609)
|
2020-10-27 10:00:24 +01:00 |
|
Kingsley Kuan
|
d1dd5d578e
|
[RLlib] Fix PyTorch A3C / A2C loss function using mixed reduced sum / mean (#11449)
|
2020-10-22 12:39:34 -07:00 |
|
Philsik Chang
|
ede9347127
|
[rllib] Add torch_distributed_backend flag for DDPPO (#11362) (#11425)
|
2020-10-21 18:30:42 -07:00 |
|
Eric Liang
|
e8c77e2847
|
Remove memory quota enforcement from actors (#11480)
* wip
* fix
* deprecate
|
2020-10-21 14:29:03 -07:00 |
|
Sven Mika
|
414041c6dd
|
[RLlib] Do not create env on driver iff num_workers > 0. (#11307)
|
2020-10-15 18:21:30 +02:00 |
|
Sven Mika
|
0c0f67c14d
|
[RLlib] ARS/ES eval workers not working: Issue 9933. (#11308)
|
2020-10-12 13:49:48 -07:00 |
|
Sven Mika
|
f5e2cda68a
|
[RLlib] SAC: log_alpha not being learnt when on GPU. (#11298)
|
2020-10-12 13:48:44 -07:00 |
|
Julius Frost
|
7dcfd258cd
|
[RLlib] Assert LongTensor in SAC Discrete PyTorch (#11245)
|
2020-10-12 13:47:21 -07:00 |
|
Sven Mika
|
d3bc20b727
|
[RLlib] ConvTranspose2D module (#11231)
|
2020-10-12 15:00:42 +02:00 |
|
Sumanth Ratna
|
14d8826e43
|
Fix overriden typo (#11227)
|
2020-10-07 19:11:07 -07:00 |
|
Sven Mika
|
ce96b03b07
|
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033)
|
2020-10-06 20:28:16 +02:00 |
|
Sven Mika
|
c17169dc11
|
[RLlib] Fix all example scripts to run on GPUs. (#11105)
|
2020-10-02 23:07:44 +02:00 |
|
Michael Luo
|
47b499d899
|
Cartpole MAML + Discrete (#11028)
|
2020-10-02 12:56:34 +02:00 |
|
Sven Mika
|
36bda8432b
|
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)
|
2020-10-01 16:57:10 +02:00 |
|
Eric Liang
|
ecdaaffc67
|
add large data warning (#10957)
|
2020-09-23 15:46:06 -07:00 |
|
Michael Luo
|
ba5a3ae9e2
|
Enable vtrace by default (#10962)
|
2020-09-22 22:18:21 -07:00 |
|
mvindiola1
|
2b893d1bb5
|
fix incorrect critic loss in TD3 (#10775)
Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>
|
2020-09-20 20:01:51 -07:00 |
|
Sven Mika
|
805dad3bc4
|
[RLlib] SAC algo cleanup. (#10825)
|
2020-09-20 11:27:02 +02:00 |
|
Sumanth Ratna
|
9da7bdcc8e
|
Use master for links to docs in source (#10866)
|
2020-09-19 00:30:45 -07:00 |
|
Eric Liang
|
f83c588f08
|
[rllib] Remove broken no eager on workers mode (#10745)
* remove no eager
* Update trainer.py
|
2020-09-15 17:25:20 -07:00 |
|
desktable
|
4ccfd07a61
|
[RLlib] Add docstrings for agents/dqn (#10710)
|
2020-09-15 12:37:07 +02:00 |
|
maxco2
|
b8436f0f00
|
[rllib] Fix SAC and DDPG tensorflow policy can't do grad_clip (#10499)
* Fix sac_tf_policy clip_by_norm missing argument
* Fix ddpg_tf_policy clip_by_norm missing argument
* Fix format
|
2020-09-11 12:04:44 -07:00 |
|
Julius Frost
|
e72838c03d
|
[RLLib] Add missing .to() for MARWIL on PyTorch (#10685)
There was a missing .to() that caused a device mismatch error on PyTorch with MARWIL.
|
2020-09-09 18:52:55 -07:00 |
|
desktable
|
799318d7d7
|
[RLlib] Add type annotations for agents/dqn (#10626)
|
2020-09-09 18:55:26 +02:00 |
|
Sven Mika
|
4b278c36fc
|
[RLlib] Behavioral Cloning (from MARWIL). (#10619)
|
2020-09-09 17:33:21 +02:00 |
|