a2c
[RLlib] Revert 41c9ef70. ( #27243 )
2022-07-29 11:05:15 -07:00
a3c
[RLlib] Revert 41c9ef70. ( #27243 )
2022-07-29 11:05:15 -07:00
alpha_star
[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). ( #25851 )
2022-06-29 08:41:47 +02:00
alpha_zero
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
apex_ddpg
[RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. ( #25924 )
2022-06-20 19:53:47 +02:00
apex_dqn
[tune/rllib] Hotfix ml_utils deprecation import error ( #27095 )
2022-07-27 16:11:58 +01:00
appo
[RLlib] Unify gnorm mixin for tf and torch policies. ( #26102 )
2022-07-24 15:31:09 +02:00
ars
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
bandit
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
bc
[RLlib]: Raise deprecation warning in MARWIL OPE methods. ( #26893 )
2022-07-23 13:55:40 +02:00
cql
[RLlib]: Move OPE to evaluation config ( #25911 )
2022-07-12 11:04:34 -07:00
crr
[RLlib] Fixes CRR flakeyness ( #26770 )
2022-07-20 12:08:57 -07:00
ddpg
[RLlib] Fix a bunch of issues related to connectors. ( #26510 )
2022-07-13 18:55:20 +02:00
ddppo
[RLlib] Cleanup some deprecated metric keys and classes. ( #26036 )
2022-06-23 21:30:01 +02:00
dqn
[RLlib] Fix memory leak in APEX_DQN ( #26691 )
2022-07-19 16:16:24 -07:00
dreamer
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
es
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
impala
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
maddpg
[RLlib] Save serialized PolicySpec. Extract num_gpus related logics into a util function. ( #25954 )
2022-06-30 11:38:21 +02:00
maml
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
marwil
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
mbmpo
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
pg
[RLlib] Fix a bunch of issues related to connectors. ( #26510 )
2022-07-13 18:55:20 +02:00
ppo
[RLlib] Unify gnorm mixin for tf and torch policies. ( #26102 )
2022-07-24 15:31:09 +02:00
qmix
[RLlib] Make QMix use the ReplayBufferAPI ( #25560 )
2022-06-23 22:55:22 -07:00
r2d2
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
sac
[RLlib] Migrating DDPG to PolicyV2. ( #26054 )
2022-06-28 15:52:56 +02:00
simple_q
[RLlib] Fix a bunch of issues related to connectors. ( #26510 )
2022-07-13 18:55:20 +02:00
slateq
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
td3
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
tests
[RLlib] Beef up worker failure test. ( #26953 )
2022-07-27 00:10:45 -07:00
__init__.py
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
algorithm.py
[rfc] [tune/rllib] Fetch _progress_metrics from trainable for verbose=2 display ( #26967 )
2022-07-27 16:04:23 +01:00
algorithm_config.py
[RLlib] Beef up worker failure test. ( #26953 )
2022-07-27 00:10:45 -07:00
callbacks.py
[RLlib] more connector polishes and fixes. ( #26645 )
2022-07-19 08:50:28 -07:00
mock.py
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
registry.py
[RLlib] Try to checkpoint a durable policy name ( #27016 )
2022-07-27 00:01:14 -07:00