ray/rllib/algorithms at f421730b4796ad3871961b573c0cba106f8a21ee - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

kourosh hakhamaneshi f421730b47 [RLlib] Added `expectation` advantage_type option to CRR. (#26142)	2022-06-28 15:40:09 +02:00
a2c	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
a3c	[RLlib] Algorithm `step()` fixes: evaluation should NOT be part of timed `training_step` loop. (#25924)	2022-06-20 19:53:47 +02:00
alpha_star	[tune/structure] Introduce execution package (#26015)	2022-06-23 11:13:19 +01:00
alpha_zero	[RLlib] Make QMix use the ReplayBufferAPI (#25560)	2022-06-23 22:55:22 -07:00
apex_ddpg	[RLlib] Algorithm `step()` fixes: evaluation should NOT be part of timed `training_step` loop. (#25924)	2022-06-20 19:53:47 +02:00
apex_dqn	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117)	2022-06-28 15:14:37 +02:00
appo	[RLlib] Aggregate Impala learner info. (#25856)	2022-06-22 09:43:10 +02:00
ars	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
bandit	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
bc	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
cql	[RLlib] Move offline input into replay buffer using rollout ops in CQL. (#25629)	2022-06-17 17:08:55 +02:00
crr	[RLlib] Added `expectation` advantage_type option to CRR. (#26142)	2022-06-28 15:40:09 +02:00
ddpg	[RLlib] Fix DDPG test ignoring `framework_iterator`-modified config. (#25913)	2022-06-21 16:17:42 +02:00
ddppo	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
dqn	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
dreamer	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
es	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
impala	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117)	2022-06-28 15:14:37 +02:00
maddpg	Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)	2022-06-14 13:59:15 -07:00
maml	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
marwil	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
mbmpo	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
pg	Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)	2022-06-14 13:59:15 -07:00
ppo	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
qmix	[RLlib] Make QMix use the ReplayBufferAPI (#25560)	2022-06-23 22:55:22 -07:00
r2d2	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
sac	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
simple_q	[RLlib] SimpleQ PyTorch Multi GPU fix (#26109)	2022-06-28 12:12:56 +02:00
slateq	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
td3	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
tests	[RLlib] Algorithm `step()` fixes: evaluation should NOT be part of timed `training_step` loop. (#25924)	2022-06-20 19:53:47 +02:00
__init__.py	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
algorithm.py	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117)	2022-06-28 15:14:37 +02:00
algorithm_config.py	[RLlib] Add timeout to filter synchronization. (#25959)	2022-06-24 14:37:43 +02:00
callbacks.py	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117)	2022-06-28 15:14:37 +02:00
mock.py	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
registry.py	[RLlib] Fixes logging of all of RLlib's Algorithm names as warning messages. (#25840)	2022-06-17 08:41:18 +02:00