Commit graph

13 commits

Author SHA1 Message Date
Sven Mika
eb0038612f
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) 2021-02-08 15:02:19 +01:00
Sven Mika
a65ee92b69
[RLlib] MARWIL loss function test case and cleanup. (#13455) 2021-01-19 09:51:05 +01:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika
bfc4f95e01
[RLlib] Fix test_bc.py test case. (#11722)
* Fix large json test file.

* Fix large json test file.

* WIP.
2020-10-31 00:16:09 -07:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika
4b278c36fc
[RLlib] Behavioral Cloning (from MARWIL). (#10619) 2020-09-09 17:33:21 +02:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Sven Mika
617eb8f279
[RLlib] Issue 9402 MARWIL producing nan rewards. (#9429) 2020-07-14 05:07:16 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
4ed796a7d6
[RLlib] Add testing Policy.compute_single_action() for all agents. (#8903) 2020-06-13 17:51:50 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action() tests for all Algos. (#8356) 2020-05-08 16:31:31 +02:00
Sven Mika
c2cb5c2214
[RLlib] MARWIL torch. (#7836)
* WIP.

* WIP.

* LINT.

* Fix MARWIL so it can run with eager-mode.

* LINT.
2020-04-06 16:38:50 -07:00