ray/rllib/tuned_examples/regression_tests/cartpole-ppo-tf.yaml at 584645cc7da2bfd7d341d52b59c9c8561dbd119b - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 19:41:38 -05:00

Sven Mika c957ed58ed [RLlib] Implement PPO torch version. (#6826)

2020-01-20 23:06:50 -08:00

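cartpole-ppo-tf:
    env: CartPole-v0
    run: PPO
    # The trial ends as soon as either criterion below is reached.
    stop:
        episode_reward_mean: 150
        timesteps_total: 100000
    config:
        # A single rollout worker collects samples.
        num_workers: 1
        # Sample batches contain whole episodes only, never truncated fragments.
        batch_mode: complete_episodes
        # Normalize observations with a running mean and standard deviation.
        observation_filter: MeanStdFilter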
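
The `stop` block above can be read as "end the trial once either threshold is reached". The following is a minimal sketch of that reading in plain Python; it does not call Ray or Tune. The names `STOP`, `should_stop`, and `reached_reward_target` are hypothetical and exist only for illustration, and the idea that a regression run counts as successful only when the reward threshold (rather than the timestep budget) is what was reached is an assumption about how such a file is typically consumed, not something stated in the file.

```python
# Illustrative sketch only: these helpers are hypothetical and are not part
# of RLlib or Tune. They mirror the `stop` block of cartpole-ppo-tf.yaml.
STOP = {"episode_reward_mean": 150, "timesteps_total": 100000}


def should_stop(result, stop=STOP):
    """Return True once any stop criterion found in `result` is reached."""
    return any(
        key in result and result[key] >= threshold
        for key, threshold in stop.items()
    )


def reached_reward_target(result, stop=STOP):
    """Assumed success rule: the reward threshold itself was reached."""
    reward = result.get("episode_reward_mean", float("-inf"))
    return reward >= stop["episode_reward_mean"]


if __name__ == "__main__":
    # Made-up result dictionaries, shaped like per-iteration training results.
    for result in (
        {"episode_reward_mean": 90.0, "timesteps_total": 40000},
        {"episode_reward_mean": 151.0, "timesteps_total": 60000},
        {"episode_reward_mean": 120.0, "timesteps_total": 100000},
    ):
        print(result, should_stop(result), reached_reward_target(result))
```

Under this reading, the third made-up result ends the trial because the timestep budget is exhausted, yet it would not count as reaching the reward target.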