ray/rllib/tuned_examples/regression_tests/pendulum-ddpg-tf.yaml at 3812bfedda7c10bd8e5ead343e02577bfe159728

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-09 21:06:39 -04:00

Sven Mika d0fab84e4d

[RLlib] DDPG PyTorch version. (#7953)

The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib.
This PR:
- Depends on the re-factor PR for DDPG (Functional Algorithm API).
- Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch)
- Updates the documentation to reflect that DDPG and TD3 now support PyTorch.

* Learning Pendulum-v0 on torch version (same config as tf). Wall time is a little slower (~20%) than tf.
* Fix GPU target model problem.

2020-04-16 10:20:01 +02:00

10 lines

220 B

YAML


pendulum-ddpg-tf:
    env: Pendulum-v0
    run: DDPG
    stop:
        episode_reward_mean: -900
        timesteps_total: 100000
    config:
        use_pytorch: false
        use_huber: true
        clip_rewards: false
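
The `stop` block in the file above can be read as a pair of thresholds. The sketch below illustrates that reading, assuming Tune's usual rule that a trial ends as soon as any listed metric reaches or exceeds its value. `EXPERIMENT` and `should_stop` are hypothetical names used only for illustration and are not part of RLlib or Tune; the result dictionaries are made-up inputs, not measured training output.

```python
# Hypothetical plain-dict mirror of the YAML experiment (not an RLlib API object).
EXPERIMENT = {
    "pendulum-ddpg-tf": {
        "env": "Pendulum-v0",
        "run": "DDPG",
        "stop": {"episode_reward_mean": -900, "timesteps_total": 100000},
        "config": {
            "use_pytorch": False,   # select the TensorFlow implementation
            "use_huber": True,
            "clip_rewards": False,
        },
    }
}


def should_stop(result, stop):
    """Return True once any reported metric reaches or exceeds its threshold.

    Assumption: this mirrors how a dict of stop criteria is usually evaluated;
    it is an illustrative helper, not the library's own code.
    """
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


stop = EXPERIMENT["pendulum-ddpg-tf"]["stop"]

# Made-up training results, used only to exercise the check.
print(should_stop({"episode_reward_mean": -1200.0, "timesteps_total": 20000}, stop))   # False
print(should_stop({"episode_reward_mean": -850.0, "timesteps_total": 40000}, stop))    # True
print(should_stop({"episode_reward_mean": -1100.0, "timesteps_total": 100000}, stop))  # True
```

Under this reading, the regression test passes when the mean episode reward reaches -900 before the 100000-timestep budget runs out; hitting the timestep cap first also ends the run, but without the reward target having been met.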