ray/rllib/tuned_examples at a337fd994e1e7bec8c8f699f75630bcab8df8948 - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

Avnish Narayan a337fd994e Revert revert #23906 [RLlib] DD-PPO training iteration function implementation. (#24035)	2022-04-21 17:37:49 +02:00
a3c	[RLlib] A2C `training_iteration` method implementation (`_disable_execution_plan_api=True`) (#23735)	2022-04-15 18:36:13 +02:00
alpha_star	Revert "Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni…" (#22153)	2022-02-08 16:43:00 +01:00
ars	[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785)	2020-06-20 00:05:19 +02:00
bandits	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
cql	[RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541)	2021-11-22 21:46:45 +01:00
ddpg	[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)	2022-04-12 07:50:09 +02:00
dqn	[RLlib][Training iteration fn] APEX conversion (#22937)	2022-04-20 17:56:18 +02:00
dreamer	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
es	[RLlib] 2 RLlib Flaky Tests (#14930)	2021-03-30 19:21:13 +02:00
impala	[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)	2022-04-12 07:50:09 +02:00
maml	[RLLib] MAML extension for all models except RNNs (#11337)	2020-11-12 16:51:40 -08:00
marwil	[rllib] Fix error messages and example for dataset writer (#23419)	2022-03-28 19:53:12 +01:00
mbmpo	MBMPO Cartpole (#11832)	2020-11-12 10:30:41 -08:00
pg	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376)	2022-01-05 18:22:33 +01:00
ppo	Revert revert #23906 [RLlib] DD-PPO training iteration function implementation. (#24035)	2022-04-21 17:37:49 +02:00
qmix	[RLlib] QMIX better defaults + added to CI learning tests (#21332)	2022-01-04 08:54:41 +01:00
sac	[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)	2022-04-12 07:50:09 +02:00
slateq	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
cleanup_experiment.py	[CI] Format Python code with Black (#21975)	2022-01-29 18:41:57 -08:00
compact-regression-test.yaml	[RLlib] Simple-Q uses training iteration fn (instead of execution_plan); ReplayBuffer API for Simple-Q (#22842)	2022-03-29 14:44:40 +02:00
create_plots.py	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)	2020-05-26 11:10:27 +02:00