Sven Mika
|
7c39aa5fac
|
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)
|
2022-06-10 17:09:18 +02:00 |
|
Sven Mika
|
b5bc2b93c3
|
[RLlib] Move all remaining algos into algorithms directory. (#25366)
|
2022-06-04 07:35:24 +02:00 |
|
Artur Niederfahrenhorst
|
d76ef9add5
|
[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. (#24923)
|
2022-05-24 14:39:43 +02:00 |
|
Artur Niederfahrenhorst
|
fb2915d26a
|
[RLlib] Replay Buffer API and Ape-X. (#24506)
|
2022-05-17 13:43:49 +02:00 |
|
Artur Niederfahrenhorst
|
95d4a83a87
|
[RLlib] R2D2 Replay Buffer API integration. (#24473)
|
2022-05-10 20:36:14 +02:00 |
|
Sven Mika
|
f066180ed5
|
[RLlib] Deprecate timesteps_per_iteration config key (in favor of min_[sample|train]_timesteps_per_reporting . (#24372)
|
2022-05-02 12:51:14 +02:00 |
|
Avnish Narayan
|
477b9d22d2
|
[RLlib][Training iteration fn] APEX conversion (#22937)
|
2022-04-20 17:56:18 +02:00 |
|
Sven Mika
|
a8494742a3
|
[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)
|
2022-04-12 07:50:09 +02:00 |
|
Artur Niederfahrenhorst
|
9a64bd4e9b
|
[RLlib] Simple-Q uses training iteration fn (instead of execution_plan); ReplayBuffer API for Simple-Q (#22842)
|
2022-03-29 14:44:40 +02:00 |
|
Sven Mika
|
d5bfb7b7da
|
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652)
|
2022-01-25 14:16:58 +01:00 |
|
Sven Mika
|
08c09737fa
|
[RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550)
|
2021-09-14 19:58:10 +02:00 |
|
Sven Mika
|
e3e6ed7aaa
|
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358)
|
2021-09-06 12:14:20 +02:00 |
|
Sven Mika
|
599e589481
|
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065)
|
2021-08-31 14:56:53 +02:00 |
|
Sven Mika
|
2d34216660
|
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762)
|
2021-05-20 09:27:03 +02:00 |
|
Sven Mika
|
6f4d988713
|
[RLlib] Issue 15556: Fix R2D2 using chunks from previous episodes in the "burn-in" window. (#15737)
|
2021-05-18 11:05:42 +02:00 |
|
Eric Liang
|
af8a93f2a4
|
Deflake some RLlib tests (#14947)
* fix
* update
* 100
* flake
|
2021-03-26 11:45:17 -07:00 |
|
Sven Mika
|
8000258333
|
[RLlib] R2D2 Implementation. (#13933)
|
2021-02-25 12:18:11 +01:00 |
|
Sven Mika
|
deb33bce84
|
[RLlib] Add DQN SoftQ learning test case. (#12712)
|
2020-12-10 14:55:19 +01:00 |
|
Sven Mika
|
b6b54f1c81
|
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827)
|
2020-11-16 10:54:35 -08:00 |
|
Sven Mika
|
5b788ccb13
|
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717)
|
2020-11-03 12:53:34 -08:00 |
|
Sven Mika
|
2746fc0476
|
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520)
|
2020-05-27 16:19:13 +02:00 |
|
Sven Mika
|
baa053496a
|
[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)
|
2020-05-26 11:10:27 +02:00 |
|