Commit graph

13 commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Artur Niederfahrenhorst | 0dceddb912 | [RLlib] Move learning_starts logic from buffers into training_step(). (#26032) | 2022-08-11 13:07:30 +02:00 |
| Artur Niederfahrenhorst | 04bc845360 | [RLlib] Fix priority update for sequenced batches. (#27544) | 2022-08-10 12:48:25 +02:00 |
| Jun Gong | acf2bf9b2f | [RLlib] Get rid of all these deprecation warnings. (#27085) | 2022-07-27 10:48:54 -07:00 |
| Sven Mika | 130b7eeaba | [RLlib] Trainer to Algorithm renaming. (#25539) | 2022-06-11 15:10:39 +02:00 |
| Artur Niederfahrenhorst | 94d6c212df | [RLlib] Replay Buffer API documentation. (#24683) | 2022-06-10 16:47:51 +02:00 |
| Artur Niederfahrenhorst | d76ef9add5 | [RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. (#24923) | 2022-05-24 14:39:43 +02:00 |
| Steven Morad | 501d932449 | [RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059) | 2022-05-22 19:58:47 +02:00 |
| Artur Niederfahrenhorst | fb2915d26a | [RLlib] Replay Buffer API and Ape-X. (#24506) | 2022-05-17 13:43:49 +02:00 |
| Sven Mika | 44a51610c2 | [RLlib] SlateQ config objects. (#24577) | 2022-05-10 20:07:18 +02:00 |
| Artur Niederfahrenhorst | 8d906f9bf8 | [RLlib] SAC with new Replay Buffer API. (#24156) | 2022-05-09 14:33:02 +02:00 |
| Artur Niederfahrenhorst | 86bc9ecce2 | [RLlib] DDPG Training iteration fn & Replay Buffer API (#24212) | 2022-05-05 09:41:38 +02:00 |
| Sven Mika | 627b9f2e88 | [RLlib] QMIX training iteration function and new replay buffer API. (#24164) | 2022-04-27 14:24:20 +02:00 |
| Sven Mika | bb4e5cb70a | [RLlib] CQL: training iteration function. (#24166) | 2022-04-26 14:28:39 +02:00 |