Commit graph

25 commits

Author SHA1 Message Date
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) 2021-11-01 21:46:02 +01:00
Sven Mika
b4300dd532
[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937) 2021-10-04 13:29:00 +02:00
Sven Mika
c2ea2c01bb
[RLlib] Redo: Add support for multi-GPU to DDPG. (#17789)
* wip.

* wip.

* wip.

* wip.

* wip.

* wip.
2021-08-13 18:01:24 -07:00
Amog Kamsetty
0b8489dcc6
Revert "[RLlib] Add support for multi-GPU to DDPG. (#17586)" (#17707)
This reverts commit 0eb0e0ff58.
2021-08-10 10:50:21 -07:00
Sven Mika
0eb0e0ff58
[RLlib] Add support for multi-GPU to DDPG. (#17586) 2021-08-05 11:39:51 -04:00
Sven Mika
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) 2021-08-03 11:35:49 -04:00
Michael Luo
474f04e322
[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707) 2021-05-19 16:32:29 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
SebastianBo1995
f5be8d8f74
[Rllib] Offline Learning Bug, different shapes (#15132) 2021-04-27 17:18:17 +02:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. (#14860) 2021-03-23 09:50:18 -07:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
Sven Mika
37c7daa3c0
[RLlib] DDPG: Support simplex action space. (#14011) 2021-02-10 15:10:01 +01:00
Sven Mika
fb318addcb
[RLlib] Curiosity exploration module: tf/tf2.x/tf-eager support. (#11945) 2020-11-29 12:31:24 +01:00
mvindiola1
2b893d1bb5
fix incorrect critic loss in TD3 (#10775)
Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>
2020-09-20 20:01:51 -07:00
maxco2
b8436f0f00
[rllib] Fix SAC and DDPG tensorflow policy can't do grad_clip (#10499)
* Fix sac_tf_policy clip_by_norm missing argument

* Fix ddpg_tf_policy clip_by_norm missing argument

* Fix format
2020-09-11 12:04:44 -07:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. (#8752) 2020-07-11 22:06:35 +02:00
Sven Mika
4da0e542d5
[RLlib] DDPG and SAC eager support (preparation for tf2.x) (#9204) 2020-07-08 16:12:20 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
4fd8977eaf
[RLlib] Minor cleanup in preparation to tf2.x support. (#9130)
* WIP.

* Fixes.

* LINT.

* Fixes.

* Fixes and LINT.

* WIP.
2020-06-25 19:01:32 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. (#8932) 2020-06-16 08:52:20 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Eric Liang
b14cc16616
[rllib] Enable functional execution workflow API by default (#8221) 2020-05-05 12:36:42 -07:00
Sven Mika
7ec2223c84
[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188)
Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).
2020-04-26 23:08:13 +02:00
Sven Mika
d0fab84e4d
[RLlib] DDPG PyTorch version. (#7953)
The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib.
This PR:
- Depends on the re-factor PR for DDPG (Functional Algorithm API).
- Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch)
- Updates the documentation to reflect that DDPG and TD3 now support PyTorch.

* Learning Pendulum-v0 on torch version (same config as tf). Wall time a little slower (~20% than tf).
* Fix GPU target model problem.
2020-04-16 10:20:01 +02:00