ray/rllib/tuned_examples at 68a9a33386a5b2d0961f228666b177052a2d406a - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

Jun Gong 68a9a33386 [RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797)	2022-05-16 09:45:32 +02:00
a3c	[RLlib] A2C `training_iteration` method implementation (`_disable_execution_plan_api=True`) (#23735)	2022-04-15 18:36:13 +02:00
alpha_star	[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797)	2022-05-16 09:45:32 +02:00
ars	[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785)	2020-06-20 00:05:19 +02:00
bandits	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
cql	[RLlib] SAC with new Replay Buffer API. (#24156)	2022-05-09 14:33:02 +02:00
ddpg	[RLlib] DDPG Training iteration fn & Replay Buffer API (#24212)	2022-05-05 09:41:38 +02:00
dqn	[RLlib] R2D2 Replay Buffer API integration. (#24473)	2022-05-10 20:36:14 +02:00
dreamer	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
es	[RLlib] 2 RLlib Flaky Tests (#14930)	2021-03-30 19:21:13 +02:00
impala	[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)	2022-04-12 07:50:09 +02:00
maddpg	[RLlib] MADDPG: Move into agents folder (from contrib) and use `training_iteration` method. (#24502)	2022-05-06 12:35:21 +02:00
maml	[RLLib] MAML extension for all models except RNNs (#11337)	2020-11-12 16:51:40 -08:00
marwil	[rllib] Fix error messages and example for dataset writer (#23419)	2022-03-28 19:53:12 +01:00
mbmpo	MBMPO Cartpole (#11832)	2020-11-12 10:30:41 -08:00
pg	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376)	2022-01-05 18:22:33 +01:00
ppo	[RLlib] DD-PPO training iteration fn. (#24118)	2022-04-22 15:22:14 -07:00
qmix	[RLlib] QMIX better defaults + added to CI learning tests (#21332)	2022-01-04 08:54:41 +01:00
sac	[RLlib] SAC with new Replay Buffer API. (#24156)	2022-05-09 14:33:02 +02:00
slateq	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
cleanup_experiment.py	[CI] Format Python code with Black (#21975)	2022-01-29 18:41:57 -08:00
compact-regression-test.yaml	[RLlib] Deprecate `timesteps_per_iteration` config key (in favor of `min_[sample|train]_timesteps_per_reporting`. (#24372)	2022-05-02 12:51:14 +02:00
create_plots.py	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)	2020-05-26 11:10:27 +02:00