ray/rllib/tuned_examples at e6e10ce4cf1137676f6ad0f922e79038691a0109 - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

Jun Gong e6e10ce4cf [RLlib] Revert `41c9ef70`. (#27243) Why are these changes needed? Also: Add validation to make sure multi-gpu and micro-batch is not used together. Update A2C learning test to hit the microbatching branch. Minor comment updates.		2022-07-29 11:05:15 -07:00
a2c	[RLlib] Revert `41c9ef70`. (#27243)	2022-07-29 11:05:15 -07:00
a3c	[RLlib] A2C + A3C move to `algorithms` folder and re-name into A2C/A3C (from ...Trainer). (#25314)	2022-06-01 09:29:16 +02:00
alpha_star	[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797)	2022-05-16 09:45:32 +02:00
alpha_zero	[RLlib] AlphaZero uses training_iteration API. (#24507)	2022-05-18 09:58:25 +02:00
apex_ddpg	[RLlib] Deprecation: Replace remaining `evaluation_num_episodes` with `evaluation_duration`. (#26000)	2022-06-23 19:11:29 +02:00
apex_dqn	[rllib] Use compress observations where replay buffers and image obs are used in tuned examples (#26735)	2022-07-22 10:10:51 -07:00
appo	[RLlib] Implemented ViewRequirementConnector (#26998)	2022-07-26 21:52:14 -07:00
ars	[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785)	2020-06-20 00:05:19 +02:00
bandits	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
bc	[RLlib] Move all remaining algos into `algorithms` directory. (#25366)	2022-06-04 07:35:24 +02:00
cql	[RLlib] Make JSONReader default, users will have to use the DatasetReader for any speedups. (#26541)	2022-07-14 17:19:38 +02:00
crr	[RLlib] Fixes CRR flakeyness (#26770)	2022-07-20 12:08:57 -07:00
ddpg	[RLlib] Deprecation: Replace remaining `evaluation_num_episodes` with `evaluation_duration`. (#26000)	2022-06-23 19:11:29 +02:00
ddppo	[RLlib] Move all remaining algos into `algorithms` directory. (#25366)	2022-06-04 07:35:24 +02:00
dqn	[rllib] Use compress observations where replay buffers and image obs are used in tuned examples (#26735)	2022-07-22 10:10:51 -07:00
dreamer	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
es	[RLlib] 2 RLlib Flaky Tests (#14930)	2021-03-30 19:21:13 +02:00
impala	[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851)	2022-06-29 08:41:47 +02:00
maddpg	[RLlib] MADDPG: Move into main `algorithms` folder and add proper unit and learning tests. (#24579)	2022-05-24 12:53:53 +02:00
maml	[RLLib] MAML extension for all models except RNNs (#11337)	2020-11-12 16:51:40 -08:00
marwil	[RLlib] Move all remaining algos into `algorithms` directory. (#25366)	2022-06-04 07:35:24 +02:00
mbmpo	MBMPO Cartpole (#11832)	2020-11-12 10:30:41 -08:00
pg	[RLlib] Deflake cartpole crashing tests. (#27097)	2022-07-27 12:50:34 -07:00
ppo	[RLlib] Implemented ViewRequirementConnector (#26998)	2022-07-26 21:52:14 -07:00
qmix	[RLlib] QMIX better defaults + added to CI learning tests (#21332)	2022-01-04 08:54:41 +01:00
r2d2	[RLlib] Better default values for `training_intensity` and `target_network_update_freq` for R2D2. (#25510)	2022-06-07 10:29:56 +02:00
sac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)	2022-06-10 17:09:18 +02:00
simple_q	[RLlib] Move all remaining algos into `algorithms` directory. (#25366)	2022-06-04 07:35:24 +02:00
slateq	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
td3	[RLlib] Deprecation: Replace remaining `evaluation_num_episodes` with `evaluation_duration`. (#26000)	2022-06-23 19:11:29 +02:00
cleanup_experiment.py	[CI] Format Python code with Black (#21975)	2022-01-29 18:41:57 -08:00
compact-regression-test.yaml	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)	2022-06-10 17:09:18 +02:00
create_plots.py	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)	2020-05-26 11:10:27 +02:00