ray/release/rllib_tests/learning_tests/yaml_files/sac/sac-halfcheetahbulletenv-v0.yaml

sac-halfcheetahbulletenv-v0:
    env: HalfCheetahBulletEnv-v0
    run: SAC
    # Minimum mean episode reward and total timesteps (both reached within time_total_s) required to pass this test.
    pass_criteria:
        episode_reward_mean: 400.0
        timesteps_total: 200000
    stop:
        time_total_s: 1800
    config:
        horizon: 1000
        soft_horizon: false
        q_model_config:
            fcnet_activation: relu
            fcnet_hiddens: [256, 256]
        policy_model_config:
            fcnet_activation: relu
            fcnet_hiddens: [256, 256]
        tau: 0.005
        target_entropy: auto
        no_done_at_end: false
        n_step: 3
        rollout_fragment_length: 1
        train_batch_size: 256
        target_network_update_freq: 1
        min_sample_timesteps_per_iteration: 1000
        replay_buffer_config:
            learning_starts: 10000
            type: MultiAgentPrioritizedReplayBuffer
        optimization:
            actor_learning_rate: 0.0003
            critic_learning_rate: 0.0003
            entropy_learning_rate: 0.0003
        num_workers: 0
        num_gpus: 1
        metrics_smoothing_episodes: 5