Artur Niederfahrenhorst | 0dceddb912 | [RLlib] Move learning_starts logic from buffers into training_step(). (#26032) | 2022-08-11 13:07:30 +02:00
Artur Niederfahrenhorst | 04bc845360 | [RLlib] Fix priority update for sequenced batches. (#27544) | 2022-08-10 12:48:25 +02:00
Jun Gong | acf2bf9b2f | [RLlib] Get rid of all these deprecation warnings. (#27085) | 2022-07-27 10:48:54 -07:00
Sven Mika | 130b7eeaba | [RLlib] Trainer to Algorithm renaming. (#25539) | 2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst | 94d6c212df | [RLlib] Replay Buffer API documentation. (#24683) | 2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst | d76ef9add5 | [RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. (#24923) | 2022-05-24 14:39:43 +02:00
Steven Morad | 501d932449 | [RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059) | 2022-05-22 19:58:47 +02:00
Artur Niederfahrenhorst | fb2915d26a | [RLlib] Replay Buffer API and Ape-X. (#24506) | 2022-05-17 13:43:49 +02:00
Sven Mika | 44a51610c2 | [RLlib] SlateQ config objects. (#24577) | 2022-05-10 20:07:18 +02:00
Artur Niederfahrenhorst | 8d906f9bf8 | [RLlib] SAC with new Replay Buffer API. (#24156) | 2022-05-09 14:33:02 +02:00
Artur Niederfahrenhorst | 86bc9ecce2 | [RLlib] DDPG Training iteration fn & Replay Buffer API (#24212) | 2022-05-05 09:41:38 +02:00
Sven Mika | 627b9f2e88 | [RLlib] QMIX training iteration function and new replay buffer API. (#24164) | 2022-04-27 14:24:20 +02:00
Sven Mika | bb4e5cb70a | [RLlib] CQL: training iteration function. (#24166) | 2022-04-26 14:28:39 +02:00