ray/rllib/algorithms at 862d10c162421706f77f73428429379a8b22fc38 - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

Amog Kamsetty 862d10c162	2022-07-27 14:24:19 +01:00
[AIR] Remove ML code from `ray.util` (#27005)

Removes all ML related code from `ray.util`

Removes:
- `ray.util.xgboost`
- `ray.util.lightgbm`
- `ray.util.horovod`
- `ray.util.ray_lightning`

Moves `ray.util.ml_utils` to other locations

Closes #23900

Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>

a2c	[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844)	2022-07-25 15:11:26 +02:00
a3c	[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844)	2022-07-25 15:11:26 +02:00
alpha_star	[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851)	2022-06-29 08:41:47 +02:00
alpha_zero	[RLlib] Make QMix use the ReplayBufferAPI (#25560)	2022-06-23 22:55:22 -07:00
apex_ddpg	[RLlib] Algorithm `step()` fixes: evaluation should NOT be part of timed `training_step` loop. (#25924)	2022-06-20 19:53:47 +02:00
apex_dqn	[AIR] Remove ML code from `ray.util` (#27005)	2022-07-27 14:24:19 +01:00
appo	[RLlib] Unify gnorm mixin for tf and torch policies. (#26102)	2022-07-24 15:31:09 +02:00
ars	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
bandit	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
bc	[RLlib]: Raise deprecation warning in MARWIL OPE methods. (#26893)	2022-07-23 13:55:40 +02:00
cql	[RLlib]: Move OPE to evaluation config (#25911)	2022-07-12 11:04:34 -07:00
crr	[RLlib] Fixes CRR flakeyness (#26770)	2022-07-20 12:08:57 -07:00
ddpg	[RLlib] Fix a bunch of issues related to connectors. (#26510)	2022-07-13 18:55:20 +02:00
ddppo	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
dqn	[RLlib] Fix memory leak in APEX_DQN (#26691)	2022-07-19 16:16:24 -07:00
dreamer	[RLlib] Simplify agent collector (#26803)	2022-07-25 13:17:17 -07:00
es	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
impala	[RLlib] Unify gnorm mixin for tf and torch policies. (#26102)	2022-07-24 15:31:09 +02:00
maddpg	[RLlib] Save serialized PolicySpec. Extract `num_gpus` related logics into a util function. (#25954)	2022-06-30 11:38:21 +02:00
maml	[RLlib] Fix a bunch of issues related to connectors. (#26510)	2022-07-13 18:55:20 +02:00
marwil	[RLlib]: Raise deprecation warning in MARWIL OPE methods. (#26893)	2022-07-23 13:55:40 +02:00
mbmpo	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
pg	[RLlib] Fix a bunch of issues related to connectors. (#26510)	2022-07-13 18:55:20 +02:00
ppo	[RLlib] Unify gnorm mixin for tf and torch policies. (#26102)	2022-07-24 15:31:09 +02:00
qmix	[RLlib] Make QMix use the ReplayBufferAPI (#25560)	2022-06-23 22:55:22 -07:00
r2d2	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
sac	[RLlib] Migrating DDPG to PolicyV2. (#26054)	2022-06-28 15:52:56 +02:00
simple_q	[RLlib] Fix a bunch of issues related to connectors. (#26510)	2022-07-13 18:55:20 +02:00
slateq	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
td3	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
tests	[RLlib] Beef up worker failure test. (#26953)	2022-07-27 00:10:45 -07:00
__init__.py	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
algorithm.py	[RLlib] Beef up worker failure test. (#26953)	2022-07-27 00:10:45 -07:00
algorithm_config.py	[RLlib] Beef up worker failure test. (#26953)	2022-07-27 00:10:45 -07:00
callbacks.py	[RLlib] more connector polishes and fixes. (#26645)	2022-07-19 08:50:28 -07:00
mock.py	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
registry.py	[RLlib] Try to checkpoint a durable policy name (#27016)	2022-07-27 00:01:14 -07:00