Sven Mika | 7c39aa5fac | [RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) | 2022-06-10 17:09:18 +02:00
Rohan Potdar | a9d8da0100 | [RLlib]: Doubly Robust Off-Policy Evaluation. (#25056) | 2022-06-07 12:52:19 +02:00
Rohan Potdar | ab81c8e9ca | [RLlib]: Rename input_evaluation to off_policy_estimation_methods. (#25107) | 2022-05-27 13:14:54 +02:00
Steven Morad | 501d932449 | [RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059) | 2022-05-22 19:58:47 +02:00
Artur Niederfahrenhorst | fb2915d26a | [RLlib] Replay Buffer API and Ape-X. (#24506) | 2022-05-17 13:43:49 +02:00
Sven Mika | 0cd7bc4054 | [RLlib] Re-establish dashboard performance tests. (#24728) | 2022-05-16 13:13:49 +02:00