Artur Niederfahrenhorst | a3f1323457 | [RLlib] Make QMix use the ReplayBufferAPI (#25560) | 2022-06-23 22:55:22 -07:00
Sven Mika | 59a967a3a0 | [RLlib] Cleanup some deprecated metric keys and classes. (#26036) | 2022-06-23 21:30:01 +02:00
Artur Niederfahrenhorst | a322cc5765 | [RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) | 2022-06-17 14:10:36 +02:00
Yi Cheng | 7b8b0f8e03 | Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776) This reverts commit 804719876b. | 2022-06-14 13:59:15 -07:00
Avnish Narayan | 804719876b | [RLlib] Remove execution plan code no longer used by RLlib. (#25624) | 2022-06-14 10:57:27 +02:00
Sven Mika | 130b7eeaba | [RLlib] Trainer to Algorithm renaming. (#25539) | 2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst | 94d6c212df | [RLlib] Replay Buffer API documentation. (#24683) | 2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst | c4a0e9d0f2 | [RLlib] Disambiguate timestep fragment storage unit in replay buffers. (#25242) | 2022-06-06 11:35:49 +02:00
Eric Liang | 905258dbc1 | Clean up docstyle in python modules and add LINT rule (#25272) | 2022-06-01 11:27:54 -07:00
Artur Niederfahrenhorst | d76ef9add5 | [RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. (#24923) | 2022-05-24 14:39:43 +02:00
Artur Niederfahrenhorst | cd16dc4dae | [RLlib] Fix estimated buffer size in replay buffers. (#24848) | 2022-05-22 21:03:23 +02:00
Steven Morad | 501d932449 | [RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059) | 2022-05-22 19:58:47 +02:00
Artur Niederfahrenhorst | fb2915d26a | [RLlib] Replay Buffer API and Ape-X. (#24506) | 2022-05-17 13:43:49 +02:00
Max Pumperla | 6a6c58b5b4 | [RLlib] Config objects for DDPG and SimpleQ. (#24339) | 2022-05-12 16:12:42 +02:00
Artur Niederfahrenhorst | 95d4a83a87 | [RLlib] R2D2 Replay Buffer API integration. (#24473) | 2022-05-10 20:36:14 +02:00
Sven Mika | 44a51610c2 | [RLlib] SlateQ config objects. (#24577) | 2022-05-10 20:07:18 +02:00
Artur Niederfahrenhorst | 8d906f9bf8 | [RLlib] SAC with new Replay Buffer API. (#24156) | 2022-05-09 14:33:02 +02:00
Artur Niederfahrenhorst | bd2fdf4752 | [RLlib] Automate sequences in timeslice_along_seq_lens_with_overlap(). (#24561) | 2022-05-09 11:55:06 +02:00
Avnish Narayan | f2bb6f6806 | [RLlib] Impala training iteration fn (#23454) | 2022-05-05 16:11:08 +02:00
Artur Niederfahrenhorst | 86bc9ecce2 | [RLlib] DDPG Training iteration fn & Replay Buffer API (#24212) | 2022-05-05 09:41:38 +02:00
Sven Mika | 627b9f2e88 | [RLlib] QMIX training iteration function and new replay buffer API. (#24164) | 2022-04-27 14:24:20 +02:00
Sven Mika | bb4e5cb70a | [RLlib] CQL: training iteration function. (#24166) | 2022-04-26 14:28:39 +02:00
Artur Niederfahrenhorst | e57ce7efd6 | [RLlib] Replay Buffer API and Training Iteration Fn for DQN. (#23420) | 2022-04-18 12:20:12 +02:00
Artur Niederfahrenhorst | 02a50f02b7 | [RLlib] RepayBuffer: _hit_counts working again. (#23586) | 2022-04-07 10:56:25 +02:00
Artur Niederfahrenhorst | 9a64bd4e9b | [RLlib] Simple-Q uses training iteration fn (instead of execution_plan); ReplayBuffer API for Simple-Q (#22842) | 2022-03-29 14:44:40 +02:00
Artur Niederfahrenhorst | 32ad6c6ef1 | [RLlib] Replay Buffer capacity check (#23523) | 2022-03-29 12:06:27 +02:00
Siyuan (Ryans) Zhuang | 0c74ecad12 | [Lint] Cleanup incorrectly formatted strings (Part 1: RLLib). (#23128) | 2022-03-15 17:34:21 +01:00
Artur Niederfahrenhorst | 37d129a965 | [RLlib] ReplayBuffer API: Test cases. (#22390) | 2022-03-08 16:54:12 +01:00
Artur Niederfahrenhorst | dea3574050 | [RLlib] Replay Buffer API (#22114) | 2022-02-09 15:04:43 +01:00