Sven Mika | 617eb8f279 | [RLlib] Issue 9402 MARWIL producing nan rewards. (#9429) | 2020-07-14 05:07:16 +02:00 |
Sven Mika | 43043ee4d5 | [RLlib] Tf2x preparation; part 2 (upgrading try_import_tf() ). (#9136) * WIP. * Fixes. * LINT. * WIP. * WIP. * Fixes. * Fixes. * Fixes. * Fixes. * WIP. * Fixes. * Test * Fix. * Fixes and LINT. * Fixes and LINT. * LINT. | 2020-06-30 10:13:20 +02:00 |
Sven Mika | 4ed796a7d6 | [RLlib] Add testing Policy.compute_single_action() for all agents. (#8903) | 2020-06-13 17:51:50 +02:00 |
Eric Liang | 9a83908c46 | [rllib] Deprecate policy optimizers (#8345) | 2020-05-21 10:16:18 -07:00 |
Sven Mika | 754290daad | [RLlib] Add light-weight Trainer.compute_action() tests for all Algos. (#8356) | 2020-05-08 16:31:31 +02:00 |
Sven Mika | c2cb5c2214 | [RLlib] MARWIL torch. (#7836) * WIP. * WIP. * LINT. * Fix MARWIL so it can run with eager-mode. * LINT. | 2020-04-06 16:38:50 -07:00 |