ray/rllib/tuned_examples/dqn/stateless-cartpole-r2d2.yaml

stateless-cartpole-r2d2:
    env: ray.rllib.examples.env.stateless_cartpole.StatelessCartPole
    run: R2D2
    stop:
        episode_reward_mean: 150
        timesteps_total: 1000000
    config:
        # Works for both torch and tf.
        framework: tf
        num_workers: 0
        # R2D2 settings.
        burn_in: 20
        zero_init_states: true
        #dueling: false
        lr: 0.0005
        # Give some more time to explore.
        exploration_config:
            epsilon_timesteps: 50000
        # Wrap with an LSTM and use a very simple base-model.
        model:
            fcnet_hiddens: [64]
            fcnet_activation: linear
            use_lstm: true
            lstm_cell_size: 64
            max_seq_len: 20
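
# Usage sketch (an assumption, not part of the original file): with a Ray
# release from around the time of this config, a tuned example like this
# could typically be launched through the RLlib CLI, for example:
#
#     rllib train -f stateless-cartpole-r2d2.yaml
#
# The path above is illustrative; point -f at wherever this file lives.
# Per the comment on `framework`, switching to PyTorch should only require
# changing that one key under `config`:
#
#     framework: torch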