Commit graph

9 commits

Author SHA1 Message Date
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika
c9435cad43
WIP. (#8456)
Fix multi-GPU histogram metrics for > 0D tensors.
2020-05-15 21:43:27 +02:00
Sven Mika
66df8b8c35
[RLlib] Working/learning example: PPO + torch + LSTM. (#7797) 2020-03-31 22:00:28 -07:00
Eric Liang
9a590ac6a5
[rllib] Fix custom model metrics in multi-device case (#7640)
* fix example

* add example test

* lin
2020-03-23 12:40:22 -07:00
Sven Mika
d537e9f0d8
[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155) 2020-02-19 12:18:45 -08:00
Eric Liang
2fb53396ad [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00