ray/rllib/tuned_examples
2022-06-28 15:40:09 +02:00
a2c [RLlib] Unflake some CI-tests. (#25313) 2022-06-03 14:51:50 +02:00
a3c [RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). (#25314) 2022-06-01 09:29:16 +02:00
alpha_star [RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797) 2022-05-16 09:45:32 +02:00
alpha_zero [RLlib] AlphaZero uses training_iteration API. (#24507) 2022-05-18 09:58:25 +02:00
apex_ddpg [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
apex_dqn [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
appo [RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) 2022-06-17 14:10:36 +02:00
ars [RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785) 2020-06-20 00:05:19 +02:00
bandits [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276) 2022-03-18 13:45:16 +01:00
bc [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
cql [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
crr [RLlib] Added expectation advantage_type option to CRR. (#26142) 2022-06-28 15:40:09 +02:00
ddpg [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
ddppo [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
dqn [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
dreamer [RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00
es [RLlib] 2 RLlib Flaky Tests (#14930) 2021-03-30 19:21:13 +02:00
impala [RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) 2022-06-17 14:10:36 +02:00
maddpg [RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. (#24579) 2022-05-24 12:53:53 +02:00
maml [RLLib] MAML extension for all models except RNNs (#11337) 2020-11-12 16:51:40 -08:00
marwil [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
mbmpo MBMPO Cartpole (#11832) 2020-11-12 10:30:41 -08:00
pg [RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924) 2022-06-20 19:53:47 +02:00
ppo [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
qmix [RLlib] QMIX better defaults + added to CI learning tests (#21332) 2022-01-04 08:54:41 +01:00
r2d2 [RLlib] Better default values for training_intensity and target_network_update_freq for R2D2. (#25510) 2022-06-07 10:29:56 +02:00
sac [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
simple_q [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
slateq [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276) 2022-03-18 13:45:16 +01:00
td3 [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
cleanup_experiment.py [CI] Format Python code with Black (#21975) 2022-01-29 18:41:57 -08:00
compact-regression-test.yaml [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
create_plots.py [RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414) 2020-05-26 11:10:27 +02:00