ray/rllib/tuned_examples at 5134e0dc12eefd7b70352bff834a0959d9a24a25 - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Sven Mika b1cda46681 [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
a3c	[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065)	2021-08-31 14:56:53 +02:00
alpha_star	Revert "Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni…" (#22153)	2022-02-08 16:43:00 +01:00
ars	[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785)	2020-06-20 00:05:19 +02:00
bandits	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
cql	[RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541)	2021-11-22 21:46:45 +01:00
ddpg	[RLlib] [CI] Deflake longer running RLlib learning tests for off policy algorithms. Fix seeding issue in TransformedAction Environments (#21685)	2022-02-04 14:59:56 +01:00
dqn	[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652)	2022-01-25 14:16:58 +01:00
dreamer	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
es	[RLlib] 2 RLlib Flaky Tests (#14930)	2021-03-30 19:21:13 +02:00
impala	[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535)	2021-11-03 16:24:00 +01:00
maml	[RLLib] MAML extension for all models except RNNs (#11337)	2020-11-12 16:51:40 -08:00
marwil	[RLlib] Dataset Reader/Writer for RLlib (#21808)	2022-01-26 16:00:46 +01:00
mbmpo	MBMPO Cartpole (#11832)	2020-11-12 10:30:41 -08:00
pg	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376)	2022-01-05 18:22:33 +01:00
ppo	[RLlib] Slate-Q tf implementation and tests/benchmarks. (#22389)	2022-02-22 09:36:44 +01:00
qmix	[RLlib] QMIX better defaults + added to CI learning tests (#21332)	2022-01-04 08:54:41 +01:00
sac	[RLlib] [CI] Deflake longer running RLlib learning tests for off policy algorithms. Fix seeding issue in TransformedAction Environments (#21685)	2022-02-04 14:59:56 +01:00
slateq	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
cleanup_experiment.py	[CI] Format Python code with Black (#21975)	2022-01-29 18:41:57 -08:00
compact-regression-test.yaml	[RLlib] Deprecate `vf_share_layers` in top-level PPO/MAML/MB-MPO configs. (#13397)	2021-01-19 09:51:35 +01:00
create_plots.py	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)	2020-05-26 11:10:27 +02:00