ray/rllib/tuned_examples
2022-08-09 16:54:03 +02:00
a2c [RLlib] Revert 41c9ef70. (#27243) 2022-07-29 11:05:15 -07:00
a3c [RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). (#25314) 2022-06-01 09:29:16 +02:00
alpha_star [RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797) 2022-05-16 09:45:32 +02:00
alpha_zero [RLlib] AlphaZero uses training_iteration API. (#24507) 2022-05-18 09:58:25 +02:00
apex_ddpg [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
apex_dqn [rllib] Use compress observations where replay buffers and image obs are used in tuned examples (#26735) 2022-07-22 10:10:51 -07:00
appo [RLlib] Implemented ViewRequirementConnector (#26998) 2022-07-26 21:52:14 -07:00
ars [RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785) 2020-06-20 00:05:19 +02:00
bandits [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276) 2022-03-18 13:45:16 +01:00
bc [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
cql [RLlib] Make JSONReader default, users will have to use the DatasetReader for any speedups. (#26541) 2022-07-14 17:19:38 +02:00
crr [RLlib] DatasetReader action normalization. (#27356) 2022-08-09 16:54:03 +02:00
ddpg [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
ddppo [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
dqn [rllib] Use compress observations where replay buffers and image obs are used in tuned examples (#26735) 2022-07-22 10:10:51 -07:00
dreamer [RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00
es [RLlib] 2 RLlib Flaky Tests (#14930) 2021-03-30 19:21:13 +02:00
impala [RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851) 2022-06-29 08:41:47 +02:00
maddpg [RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. (#24579) 2022-05-24 12:53:53 +02:00
maml [RLLib] MAML extension for all models except RNNs (#11337) 2020-11-12 16:51:40 -08:00
marwil [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
mbmpo MBMPO Cartpole (#11832) 2020-11-12 10:30:41 -08:00
pg [RLlib] Fix the last cartpole-crashing premerge test. (#27315) 2022-08-02 20:08:33 +02:00
ppo [RLlib] Implemented ViewRequirementConnector (#26998) 2022-07-26 21:52:14 -07:00
qmix [RLlib] QMIX better defaults + added to CI learning tests (#21332) 2022-01-04 08:54:41 +01:00
r2d2 [RLlib] Better default values for training_intensity and target_network_update_freq for R2D2. (#25510) 2022-06-07 10:29:56 +02:00
sac [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
simple_q [RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
slateq [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276) 2022-03-18 13:45:16 +01:00
td3 [RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
cleanup_experiment.py [CI] Format Python code with Black (#21975) 2022-01-29 18:41:57 -08:00
compact-regression-test.yaml [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
create_plots.py [RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414) 2020-05-26 11:10:27 +02:00