Commit graph

12 commits

Author SHA1 Message Date
Sven Mika
83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. (#7314)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix

* WIP.

* Add TD3 quick Pendulum regresison.

* Cleanup.

* Fix.

* LINT.

* Fix.

* Sort quick_learning test cases, add TD3.

* Sort quick_learning test cases, add TD3.

* Revert test_checkpoint_restore.py (debugging) changes.

* Fix old soft_q settings in documentation and test configs.

* More doc fixes.

* Fix test case.

* Fix test case.

* Lower test load.

* WIP.
2020-03-01 11:53:35 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation (#6981)
* eval api

* update

* sync eval filters

* sync fix

* docs

* update

* docs

* update

* link

* nit

* doc updates

* format
2020-02-01 22:12:12 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181) 2019-11-18 10:39:07 -08:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
Eric Liang
02583a8598 [rllib] Rename PolicyGraph => Policy, move from evaluation/ to policy/ (#4819)
This implements some of the renames proposed in #4813
We leave behind backwards-compatibility aliases for *PolicyGraph and SampleBatch.
2019-05-20 16:46:05 -07:00
Eric Liang
37208216ae
[rllib] Rename Agent to Trainer (#4556) 2019-04-07 00:36:18 -07:00
Eric Liang
4b8b703561
[rllib] Some API cleanups and documentation improvements (#4409) 2019-03-21 21:34:22 -07:00
Eric Liang
d9da183c7d
[rllib] Custom supervised loss API (#4083) 2019-02-24 15:36:13 -08:00
Eric Liang
2dccf383dd
[rllib] Basic infrastructure for off-policy estimation (IS, WIS) (#3941) 2019-02-13 16:25:05 -08:00
Eric Liang
fb73cedf70
[rllib] Add examples page, add hierarchical training example, delete SC2 examples (#3815)
* wip

* lint

* wip

* up

* wip

* update examples

* wip

* remove carla

* update

* improve envspec

* link to custom

* Update rllib-env.rst

* update

* fix

* fn

* lint

* ds

* ssd games

* desc

* fix up docs

* fix
2019-01-29 21:06:09 -08:00
Eric Liang
03fe760616
[rllib] Model self loss isn't included in all algorithms (#3679) 2019-01-04 22:30:35 -08:00
Eric Liang
ca864faece
[rllib] Documentation for I/O API and multi-agent support / cleanup (#3650) 2019-01-03 15:15:36 +08:00