ray/rllib/tuned_examples/pong-impala.yaml

# With 128 workers, this reaches a reward of 18-19 in under 10 minutes on a
# Tesla M60 GPU (e.g., a G3 EC2 node). Time to reach that reward by worker count:
#   128 workers -> 8 minutes
#    32 workers -> 17 minutes
#    16 workers -> 40+ minutes
# See also: pong-impala-fast.yaml, pong-impala-vectorized.yaml
pong-impala:
    env: PongNoFrameskip-v4
    run: IMPALA
    config:
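        # Environment steps each worker collects per sample fragment.
        rollout_fragment_length: 50
        # Steps per training batch (500 / 50 = 10 fragments per batch).
        train_batch_size: 500
        # Number of parallel rollout workers; see the timings above.
        num_workers: 128
        # Environments stepped by each worker.
        num_envs_per_worker: 1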
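
# A minimal sketch of a smaller-scale variant, kept commented out so that this
# file still defines a single experiment. The experiment name "pong-impala-32"
# is hypothetical, and this exact variant is an assumption rather than a tuned
# config; it only changes num_workers to match the 32-worker timing noted
# above (about 17 minutes).
#
# pong-impala-32:
#     env: PongNoFrameskip-v4
#     run: IMPALA
#     config:
#         rollout_fragment_length: 50
#         train_batch_size: 500
#         num_workers: 32
#         num_envs_per_worker: 1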