Commit graph

10 commits

Author SHA1 Message Date
Sven Mika
1bc6419e0e
[RLlib] R2D2 training_iteration fn and switch off execution_plan API by default. (#24165) 2022-05-03 07:59:26 +02:00
Sven Mika
92781c603e
[RLlib] A2C training_iteration method implementation (_disable_execution_plan_api=True) (#23735) 2022-04-15 18:36:13 +02:00
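These training_iteration commits hinge on a single config flag. A minimal sketch of opting in explicitly, assuming an RLlib version from this era; only the _disable_execution_plan_api key is taken from the commit messages above, everything else is illustrative:

    import ray
    from ray.rllib.agents.a3c import A2CTrainer

    ray.init()

    config = {
        "env": "CartPole-v1",
        # Route training through the new training_iteration() code path
        # instead of the legacy execution_plan API (these commits flip the
        # default, algorithm by algorithm).
        "_disable_execution_plan_api": True,
    }

    trainer = A2CTrainer(config=config)
    result = trainer.train()  # runs one training iteration
    print(result["episode_reward_mean"])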
Steven Morad
00922817b6
[RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. (#23673) 2022-04-11 08:39:10 +02:00
Sven Mika
434265edd0
[RLlib] Examples folder: All training_iteration translations. (#23712) 2022-04-05 16:33:50 +02:00
Sven Mika
7cb86acce2
[RLlib] trainer_template.py: hard deprecation (error when used). (#23488) 2022-03-25 18:25:51 +01:00
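With trainer_template.py now raising an error, the replacement is to subclass Trainer directly. A minimal sketch of that pattern, assuming the Trainer API of this era (method names should be checked against the installed version; the choice of PGTorchPolicy is illustrative):

    from ray.rllib.agents.pg.pg_torch_policy import PGTorchPolicy
    from ray.rllib.agents.trainer import COMMON_CONFIG, Trainer

    class MyTrainer(Trainer):
        @classmethod
        def get_default_config(cls):
            # Start from the shared defaults and override as needed
            # (formerly the default_config= argument of build_trainer()).
            return {**COMMON_CONFIG, "lr": 0.0005}

        def get_default_policy_class(self, config):
            # Formerly the default_policy= argument of build_trainer().
            return PGTorchPolicy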
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
371fbb17e4
[RLlib] Make policies_to_train more flexible via callable option. (#20735) 2022-01-27 12:17:34 +01:00
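The callable option gives per-batch control over which policies keep learning. A hedged sketch of a multi-agent config using it; the callable's signature (policy ID plus an optional train batch) follows #20735, while the policy names and mapping function are made up for illustration:

    config = {
        "multiagent": {
            # Two policies; only one of them should continue training.
            "policies": {"learner", "frozen_opponent"},
            "policy_mapping_fn": lambda agent_id, episode, worker, **kwargs: (
                "learner" if agent_id == "player_0" else "frozen_opponent"
            ),
            # Instead of a fixed list of IDs, decide per policy (and
            # optionally per train batch) whether it should be trained.
            "policies_to_train": lambda policy_id, batch=None: policy_id == "learner",
        },
    }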
gjoliver
99a0088233
[RLlib] Unify the way we create local replay buffers for all agents (#19627)
* [RLlib] Unify the way we create and use LocalReplayBuffer for all agents.

This change
1. Gets rid of the try...except clause around the execution_plan() call,
   and with it the deprecation warning that clause produced.
2. Fixes the execution_plan() call in Trainer._try_recover() as well.
3. Most importantly, makes it much easier to create and use different types
   of local replay buffers for all our agents, e.g., a reservoir sampling
   replay buffer for the APPO agent for Riot in the near future.
* Introduce explicit configuration for replay buffer types (see the sketch
  after this entry).
* Fix the is_training key error.
* Actually deprecate the buffer_size field.
2021-10-26 20:56:02 +02:00
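A hedged sketch of the explicit replay-buffer configuration introduced here; the replay_buffer_config key and the buffer_size deprecation come straight from the commit message, while the nested key names and values are illustrative and version-dependent:

    config = {
        # One self-contained buffer spec instead of scattered top-level keys.
        "replay_buffer_config": {
            # Buffer class to instantiate; a reservoir-sampling buffer could
            # be plugged in here, per the commit's motivation.
            "type": "MultiAgentReplayBuffer",
            # Replaces the now-deprecated top-level "buffer_size" field.
            "capacity": 50000,
        },
    }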
Richard Liaw
a78a2263e5
[RLlib] Fix reverted RockPaperScissors PettingZoo example (#16896) 2021-07-22 10:55:07 -04:00
Pierre TASSEL
66605cfcbd
[RLlib] Random Parametric Trainer (#11366) 2020-11-04 11:12:51 +01:00