ray/python
Sam Toyer 663e92ab3f [rllib] TD3/DDPG improvements and MuJoCo benchmarks (#4694)
* [rllib] Separate optimisers for DDPG actor & critic

* [rllib] Better names for DDPG variables & options

Config changes:

- noise_scale -> exploration_ou_noise_scale
- exploration_theta -> exploration_ou_theta
- exploration_sigma -> exploration_ou_sigma
- act_noise -> exploration_gaussian_sigma
- noise_clip -> target_noise_clip
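
For anyone carrying an older DDPG config across these renames, the mapping above can be applied mechanically. The following is only a sketch: `DDPG_RENAMED_OPTIONS` and `migrate_ddpg_config` are hypothetical names used for illustration and are not part of RLlib; only the old and new option names come from the list above.

```python
# Old -> new option names, copied from the "Config changes" list.
DDPG_RENAMED_OPTIONS = {
    "noise_scale": "exploration_ou_noise_scale",
    "exploration_theta": "exploration_ou_theta",
    "exploration_sigma": "exploration_ou_sigma",
    "act_noise": "exploration_gaussian_sigma",
    "noise_clip": "target_noise_clip",
}


def migrate_ddpg_config(old_config):
    """Return a copy of old_config with the renamed keys applied.

    Hypothetical helper for illustration; not an RLlib function.
    """
    new_config = {}
    for key, value in old_config.items():
        new_key = DDPG_RENAMED_OPTIONS.get(key, key)
        if new_key in new_config:
            # Both the old and the new spelling were supplied.
            raise ValueError("option %r is set twice after renaming" % new_key)
        new_config[new_key] = value
    return new_config


migrated = migrate_ddpg_config({"noise_scale": 0.1, "act_noise": 0.2, "gamma": 0.99})
```

Keys that were not renamed (such as `gamma` here) pass through unchanged, and supplying both spellings of one option raises an error rather than silently picking one.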

* [rllib] Make DDPG less class-y

Replaced three classes, each consisting of only an __init__ method & a
handful of unrelated attributes, with plain functions.

* [rllib] Refactor DDPG noise

* [rllib] Unify DDPG exploration annealing

Added option "exploration_should_anneal" to enable linear annealing of
exploration noise. By default this is off, for consistency with DDPG &
TD3 papers. Also renamed "exploration_final_eps" to
"exploration_final_scale" (that name seems to have been carried over
from DQN, and doesn't really make sense here). Finally, tried to rename
"eps" to "noise_scale" wherever possible.
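
As a rough illustration of what linear annealing of the noise scale means here, the sketch below interpolates from an initial scale to a final scale over a fixed number of timesteps. The function name, its signature, and the default values are assumptions made for this example; only the ideas behind "exploration_should_anneal" and "exploration_final_scale" come from the message above, and the actual RLlib code may differ.

```python
def annealed_noise_scale(timestep, anneal_timesteps,
                         initial_scale=1.0, final_scale=0.02,
                         should_anneal=False):
    """Return the exploration noise scale at a given timestep.

    Hypothetical sketch: should_anneal mirrors "exploration_should_anneal"
    and final_scale mirrors "exploration_final_scale". The defaults are
    arbitrary example values, not RLlib's.
    """
    if not should_anneal:
        # Annealing off (the default): the scale stays constant.
        return initial_scale
    if anneal_timesteps <= 0:
        raise ValueError("anneal_timesteps must be positive")
    # Fraction of the schedule completed, clamped to [0, 1].
    fraction = min(max(timestep / anneal_timesteps, 0.0), 1.0)
    return initial_scale + fraction * (final_scale - initial_scale)


halfway = annealed_noise_scale(500, 1000, initial_scale=1.0,
                               final_scale=0.0, should_anneal=True)
```

Once the schedule ends, the scale stays at `final_scale`; with `should_anneal=False` the timestep is ignored entirely.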
2019-04-26 17:49:53 -07:00
Name                       Last commit                                                       Date
benchmarks                 Change timeout from milliseconds to seconds in ray.wait. (#3706)  2019-01-08 21:32:08 -08:00
ray                        [rllib] TD3/DDPG improvements and MuJoCo benchmarks (#4694)       2019-04-26 17:49:53 -07:00
asv.conf.json              [asv] Pushing to s3 (#2246)                                       2018-06-20 10:43:44 -07:00
build-wheel-macos.sh       Build wheels for macOS with Bazel (#4280)                         2019-03-15 10:37:57 -07:00
build-wheel-manylinux1.sh  Build wheels for macOS with Bazel (#4280)                         2019-03-15 10:37:57 -07:00
README-benchmarks.rst      [rllib][asv] Support ASV for RLlib (#2304)                        2018-06-28 17:20:09 -07:00
README-building-wheels.md  fix wheel building doc (#4360)                                    2019-03-13 23:11:30 -07:00
setup.py                   Remove CMake files (#4493)                                        2019-04-02 22:17:33 -07:00