ray/rllib/offline (last commit: 2021-09-07 11:48:41 +02:00)
__init__.py [RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) 2021-01-21 16:43:55 +01:00
d4rl_reader.py [RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603) 2021-05-04 19:06:19 +02:00
input_reader.py [RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
io_context.py [rllib] Enhancements to Input API for customizing offline datasets (#16957) 2021-07-10 15:05:25 -07:00
is_estimator.py [RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761) 2021-05-13 09:17:23 +02:00
json_reader.py [RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381) 2021-09-07 11:48:41 +02:00
json_writer.py [RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335) 2021-04-15 19:19:51 +02:00
mixed_input.py [rllib] Enhancements to Input API for customizing offline datasets (#16957) 2021-07-10 15:05:25 -07:00
off_policy_estimator.py [RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014) 2021-07-13 14:01:30 -04:00
output_writer.py [RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
shuffled_input.py [RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
wis_estimator.py [RLlib] In OffPolicyEstimators (Offline RL): Include last step of trajectory (#12619) 2020-12-08 12:39:40 +01:00
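The files above map onto RLlib's offline-data trainer config keys of this era (`input`, `output`, `input_evaluation`). The sketch below is a hypothetical illustration of that wiring as a plain dict; the paths are made up, and exact key names should be checked against the RLlib version in use.

```python
# Hypothetical sketch: how the modules in rllib/offline are selected via
# the trainer config in the RLlib 1.x era this listing reflects.
# All paths below are placeholders.
offline_config = {
    # json_reader.py / d4rl_reader.py: source of offline experiences.
    "input": "/tmp/cartpole-out",
    # mixed_input.py alternative: mix offline data with live sampling,
    # e.g. "input": {"/tmp/cartpole-out": 0.75, "sampler": 0.25},
    # while shuffled_input.py shuffles the configured readers.
    # json_writer.py / output_writer.py: sink for newly collected data.
    "output": "/tmp/new-experiences",
    # off_policy_estimator.py dispatches to is_estimator.py ("is",
    # importance sampling) and wis_estimator.py ("wis", weighted
    # importance sampling) for off-policy evaluation.
    "input_evaluation": ["is", "wis"],
}
```

In practice this dict would be merged into an algorithm's config (e.g. for CQL or MARWIL, which several of the commit messages above reference) rather than used standalone.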
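For orientation, `json_writer.py` emits and `json_reader.py` consumes newline-delimited JSON, one serialized batch per line. The sketch below is a simplified, uncompressed stand-in using only the standard library: the column names mirror common `SampleBatch` fields, but real RLlib output may encode large arrays (e.g. compressed observations), which this omits.

```python
import json

# Hypothetical, simplified view of one line of RLlib JSON offline data:
# a dict of SampleBatch-style columns, serialized as a single JSON line.
batch = {
    "type": "SampleBatch",
    "obs": [[0.1, 0.2], [0.3, 0.4]],   # one observation per timestep
    "actions": [0, 1],
    "rewards": [1.0, 0.5],
    "dones": [False, True],            # episode terminates at step 2
}

line = json.dumps(batch)               # what a writer would append to the file
restored = json.loads(line)            # what a reader would parse back
```

Round-tripping through `json.dumps`/`json.loads` is the essential contract; readers like `ShuffledInput` and `MixedInput` then wrap this parsing step rather than define their own format.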