hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-12 14:16:39 -04:00

Author	SHA1	Message	Date
Sven Mika	7c39aa5fac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076 )	2022-06-10 17:09:18 +02:00
Rohan Potdar	a9d8da0100	[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056 )	2022-06-07 12:52:19 +02:00
Rohan Potdar	ab81c8e9ca	[RLlib]: Rename `input_evaluation` to `off_policy_estimation_methods`. (#25107 )	2022-05-27 13:14:54 +02:00
Steven Morad	501d932449	[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059 )	2022-05-22 19:58:47 +02:00
Artur Niederfahrenhorst	fb2915d26a	[RLlib] Replay Buffer API and Ape-X. (#24506 )	2022-05-17 13:43:49 +02:00
Sven Mika	0cd7bc4054	[RLlib] Re-establish dashboard performance tests. (#24728 )	2022-05-16 13:13:49 +02:00