mirror of https://github.com/vale981/ray synced 2025-03-12 22:26:39 -04:00
Commit graph

22 commits

Author SHA1 Message Date
Jun Gong
b383d987d1
[RLlib] Fix a bunch of issues related to connectors. () 2022-07-13 18:55:20 +02:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config () 2022-07-12 11:04:34 -07:00
Avnish Narayan
1243ed62bf
[RLlib] Make Dataset reader default reader and enable CRR to use dataset ()
Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>
2022-07-08 12:43:35 -07:00
Sven Mika
59a967a3a0
[RLlib] Cleanup some deprecated metric keys and classes. () 2022-06-23 21:30:01 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. () 2022-06-20 15:54:00 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. () 2022-06-11 15:10:39 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. () 2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. () 2022-06-10 16:47:51 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. () 2022-06-07 12:52:19 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. () 2022-06-04 07:35:24 +02:00
Jun Gong
1d24d6af98
[RLlib] Fix MARWIL tf policy. () 2022-06-03 10:50:36 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. () 2022-05-27 13:14:54 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. () 2022-05-25 14:38:03 +02:00
Artur Niederfahrenhorst
d76ef9add5
[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. () 2022-05-24 14:39:43 +02:00
Rohan Potdar
5a70b732e8
[RLlib] MARWIL and BC Config. () 2022-05-21 12:50:20 +02:00
Jun Gong
d5a6d46049
[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. () 2022-05-20 14:10:59 +02:00
Jun Gong
dea134a472
[RLlib] Clean up Policy mixins. () 2022-05-17 17:16:08 +02:00
Artur Niederfahrenhorst
fb2915d26a
[RLlib] Replay Buffer API and Ape-X. () 2022-05-17 13:43:49 +02:00
Sven Mika
0cd7bc4054
[RLlib] Re-establish dashboard performance tests. () 2022-05-16 13:13:49 +02:00
Jun Gong
68a9a33386
[RLlib] Retry agents -> algorithms. with proper doc changes this time. () 2022-05-16 09:45:32 +02:00
Simon Mo
9f23affdc0
[Hotfix] Unbreak lint in master () 2022-05-13 15:05:05 -07:00
kourosh hakhamaneshi
ffcbb30552
[RLlib] Move from agents to algorithms - CQL, MARWIL, AlphaStar, MAML, Dreamer, MBMPO. () 2022-05-13 18:43:36 +02:00