ray/rllib/contrib/alpha_zero/core
gjoliver 99a0088233
[RLlib] Unify the way we create local replay buffer for all agents (#19627)
* [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents.

This change:
1. Gets rid of the try...except clause around the execution_plan() call,
   and with it the deprecation warning.
2. Fixes the execution_plan() call in Trainer._try_recover() as well.
3. Most importantly, makes it much easier to create and use different types
   of local replay buffers for all our agents, e.g., allowing us to easily
   create a reservoir-sampling replay buffer for the APPO agent for Riot in
   the near future.
* Introduce explicit configuration for replay buffer types (see the illustrative sketch below).
* Fix an is_training key error.
* Actually deprecate the buffer_size field.
2021-10-26 20:56:02 +02:00
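
To illustrate the explicit replay-buffer configuration this commit describes, here is a minimal sketch of what such a config block could look like. The key names (replay_buffer_config, type, capacity) and the default values are assumptions made for illustration, not taken from this PR.

    # Hypothetical sketch only: key names and values are assumed, not confirmed by PR #19627.
    config = {
        "replay_buffer_config": {
            # Which local replay buffer class to build; a reservoir-sampling
            # buffer could be plugged in here for an agent such as APPO.
            "type": "LocalReplayBuffer",
            # Explicit capacity, standing in for the deprecated top-level
            # buffer_size field.
            "capacity": 50000,
        },
    }

Under a scheme like this, the trainer's execution plan would read buffer settings from one well-known config key rather than agent-specific fields, which is what makes swapping buffer types straightforward.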
__init__.py [RLlib] Move all jenkins RLlib-tests into bazel (rllib/BUILD). (#7178) 2020-02-15 14:50:44 -08:00
alpha_zero_policy.py [RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917) 2021-09-30 16:39:38 +02:00
alpha_zero_trainer.py [RLlib] Unify the way we create local replay buffer for all agents (#19627) 2021-10-26 20:56:02 +02:00
mcts.py Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
ranked_rewards.py AlphaZero and Ranked reward implementation (#6385) 2019-12-07 12:08:40 -08:00