ray/rllib/agents
2021-09-21 22:00:14 +02:00
..
a3c [RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 2021-08-31 14:56:53 +02:00
ars [RLlib] DDPPO fixes and benchmarks. (#18390) 2021-09-08 19:39:01 +02:00
cql [RLlib Testing] Lower --smoke-test "time_total_s" to make sure it doesn't time out. (#18670) 2021-09-16 18:22:23 +02:00
ddpg [RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 2021-08-31 14:56:53 +02:00
dqn [RLlib] Bump tf version in ML docker to tf==2.5.0; add tfp to ML-docker. (#18544) 2021-09-15 08:46:37 +02:00
dreamer [RLlib] Dreamer fixes and reinstate Dreamer test. (#17821) 2021-08-18 18:47:08 +02:00
es [RLlib] DDPPO fixes and benchmarks. (#18390) 2021-09-08 19:39:01 +02:00
impala [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) 2021-09-21 22:00:14 +02:00
maml [RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
marwil [RLlib] MARWIL + BC: Various fixes and enhancements. (#16218) 2021-06-03 22:29:00 +02:00
mbmpo [RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
pg [RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306) 2021-09-03 13:29:57 +02:00
ppo [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) 2021-09-21 22:00:14 +02:00
qmix [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
sac [RLlib Testing] Lower --smoke-test "time_total_s" to make sure it doesn't time out. (#18670) 2021-09-16 18:22:23 +02:00
slateq [RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
tests [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
__init__.py [RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. (#15815) 2021-05-18 13:23:00 +02:00
callbacks.py [RLlib] Add policies arg to callback: on_episode_step (already exists in all other episode-related callbacks) (#18119) 2021-08-27 16:12:19 +02:00
mock.py [Testing] Split RLlib example scripts CI tests into 4 jobs (from 2). (#17331) 2021-07-26 10:52:55 -04:00
registry.py [RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530) 2021-08-03 18:30:02 -04:00
trainer.py [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) 2021-09-21 22:00:14 +02:00
trainer_template.py [RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). (#18380) 2021-09-07 08:08:37 +02:00