a2c
[RLlib] Revert 41c9ef70. ( #27243 )
2022-07-29 11:05:15 -07:00
a3c
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). ( #25314 )
2022-06-01 09:29:16 +02:00
alpha_star
[RLlib] Retry agents -> algorithms. with proper doc changes this time. ( #24797 )
2022-05-16 09:45:32 +02:00
alpha_zero
[RLlib] AlphaZero uses training_iteration API. ( #24507 )
2022-05-18 09:58:25 +02:00
apex_ddpg
[RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. ( #26000 )
2022-06-23 19:11:29 +02:00
apex_dqn
[rllib] Use compress observations where replay buffers and image obs are used in tuned examples ( #26735 )
2022-07-22 10:10:51 -07:00
appo
[RLlib] Implemented ViewRequirementConnector ( #26998 )
2022-07-26 21:52:14 -07:00
ars
[RLlib] Make sure torch and tf behave the same wrt conv2d nets. ( #8785 )
2020-06-20 00:05:19 +02:00
bandits
[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes ( #23276 )
2022-03-18 13:45:16 +01:00
bc
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
cql
[RLlib] Make JSONReader default, users will have to use the DatasetReader for any speedups. ( #26541 )
2022-07-14 17:19:38 +02:00
crr
[RLlib] Fixes CRR flakeyness ( #26770 )
2022-07-20 12:08:57 -07:00
ddpg
[RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. ( #26000 )
2022-06-23 19:11:29 +02:00
ddppo
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
dqn
[rllib] Use compress observations where replay buffers and image obs are used in tuned examples ( #26735 )
2022-07-22 10:10:51 -07:00
dreamer
[RLlib] Dreamer ( #10172 )
2020-08-26 13:24:05 +02:00
es
[RLlib] 2 RLlib Flaky Tests ( #14930 )
2021-03-30 19:21:13 +02:00
impala
[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). ( #25851 )
2022-06-29 08:41:47 +02:00
maddpg
[RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. ( #24579 )
2022-05-24 12:53:53 +02:00
maml
[RLLib] MAML extension for all models except RNNs ( #11337 )
2020-11-12 16:51:40 -08:00
marwil
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
mbmpo
MBMPO Cartpole ( #11832 )
2020-11-12 10:30:41 -08:00
pg
[RLlib] Deflake cartpole crashing tests. ( #27097 )
2022-07-27 12:50:34 -07:00
ppo
[RLlib] Implemented ViewRequirementConnector ( #26998 )
2022-07-26 21:52:14 -07:00
qmix
[RLlib] QMIX better defaults + added to CI learning tests ( #21332 )
2022-01-04 08:54:41 +01:00
r2d2
[RLlib] Better default values for training_intensity and target_network_update_freq for R2D2. ( #25510 )
2022-06-07 10:29:56 +02:00
sac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. ( #25076 )
2022-06-10 17:09:18 +02:00
simple_q
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
slateq
[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes ( #23276 )
2022-03-18 13:45:16 +01:00
td3
[RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. ( #26000 )
2022-06-23 19:11:29 +02:00
cleanup_experiment.py
[CI] Format Python code with Black ( #21975 )
2022-01-29 18:41:57 -08:00
compact-regression-test.yaml
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. ( #25076 )
2022-06-10 17:09:18 +02:00
create_plots.py
[RLlib] Benchmark and regression test yaml cleanup and restructuring. ( #8414 )
2020-05-26 11:10:27 +02:00