ray/rllib/tuned_examples/mountaincarcontinuous-apex-ddpg.yaml
Sven Mika 83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. (#7314)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix

* WIP.

* Add TD3 quick Pendulum regression.

* Cleanup.

* Fix.

* LINT.

* Fix.

* Sort quick_learning test cases, add TD3.

* Revert test_checkpoint_restore.py (debugging) changes.

* Fix old soft_q settings in documentation and test configs.

* More doc fixes.

* Fix test case.

* Fix test case.

* Lower test load.

* WIP.
2020-03-01 11:53:35 -08:00

# This can be expected to reach 90 reward within ~1.5-2.5m timesteps / ~150-250 seconds on a K40 GPU
mountaincarcontinuous-apex-ddpg:
    env: MountainCarContinuous-v0
    run: APEX_DDPG
    stop:
        episode_reward_mean: 90
    config:
        clip_rewards: False
        num_workers: 16
        exploration_config:
            ou_base_scale: 1.0
        n_step: 3
        target_network_update_freq: 50000
        tau: 1.0
        evaluation_interval: 5
        evaluation_num_episodes: 10
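As a sketch, the same experiment can also be expressed programmatically with Ray Tune. The dict below mirrors the YAML keys one-to-one; `APEX_DDPG` and the config names come from the file above, while the `tune.run` call itself is left commented out so the snippet stays self-contained (actually training would require `pip install "ray[rllib]"`):

```python
# Python equivalent of the tuned-example YAML above.
# Stopping criterion: mean episode reward of 90 on MountainCarContinuous-v0.
stop = {"episode_reward_mean": 90}

config = {
    "env": "MountainCarContinuous-v0",
    "clip_rewards": False,
    "num_workers": 16,
    # Scale of the Ornstein-Uhlenbeck action noise used for exploration.
    "exploration_config": {"ou_base_scale": 1.0},
    "n_step": 3,
    "target_network_update_freq": 50000,
    "tau": 1.0,
    "evaluation_interval": 5,
    "evaluation_num_episodes": 10,
}

# To launch the run (uncomment with Ray/RLlib installed):
# from ray import tune
# tune.run("APEX_DDPG", config=config, stop=stop)

print(config["exploration_config"]["ou_base_scale"])  # 1.0
```

Equivalently, the YAML file can be run directly from the repository root with RLlib's CLI, e.g. `rllib train -f rllib/tuned_examples/mountaincarcontinuous-apex-ddpg.yaml`.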