kourosh hakhamaneshi
|
4607e788c1
|
[RLlib] Fix test_ope flakiness (#27676)
|
2022-08-09 16:12:30 -07:00 |
|
Rohan Potdar
|
5b6a58ed28
|
[RLlib] Add OPE Learning Tests (#27154)
|
2022-08-02 17:51:38 -07:00 |
|
Jun Gong
|
6b6d3017ba
|
[RLlib] more connector polishes and fixes. (#26645)
|
2022-07-19 08:50:28 -07:00 |
|
Sven Mika
|
a8494742a3
|
[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)
|
2022-04-12 07:50:09 +02:00 |
|
Balaji Veeramani
|
7f1bacc7dc
|
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
|
2022-01-29 18:41:57 -08:00 |
|
Sven Mika
|
daa4304a91
|
[RLlib] Switch off preprocessors by default for PGTrainer. (#21008)
|
2021-12-13 12:04:23 +01:00 |
|
Sven Mika
|
61a1274619
|
[RLlib] No Preprocessors (part 2). (#18468)
|
2021-09-23 12:56:45 +02:00 |
|
Sven Mika
|
a96dbd885b
|
[RLlib] Reinstate trajectory view API tests. (#18809)
|
2021-09-23 08:31:51 +02:00 |
|
simonsays1980
|
60aee4a330
|
[RLlib] Add example script for bare metal Policy with custom view_requirements. (#17896)
|
2021-08-20 12:17:13 +02:00 |
|
Richard Liaw
|
a78a2263e5
|
[RLlib] Fix reverted RockPaperScissors Pettingzoo example (#16896)
|
2021-07-22 10:55:07 -04:00 |
|
Amog Kamsetty
|
ecb632140f
|
Revert "RockPaperScissors Pettingzoo" (#16886)
This reverts commit bf3e3225b6.
|
2021-07-06 09:43:47 -07:00 |
|
Rodrigo de Lazcano
|
bf3e3225b6
|
RockPaperScissors Pettingzoo (#16725)
|
2021-07-05 09:52:08 -07:00 |
|
Sven Mika
|
391cdfae8c
|
[RLlib] Trajectory view API docs. (#12718)
|
2020-12-30 17:32:21 -08:00 |
|
Sven Mika
|
99c81c6795
|
[RLlib] Attention Net prep PR #3. (#12450)
|
2020-12-07 13:08:17 +01:00 |
|
Sven Mika
|
3ad9365e1d
|
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449)
|
2020-12-01 08:21:45 +01:00 |
|
Sven Mika
|
0df55a139c
|
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.
* Fix.
* Fix.
* Fix.
|
2020-11-27 16:25:47 -08:00 |
|
Sven Mika
|
62c7ab5182
|
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
2020-11-12 16:27:34 +01:00 |
|
desktable
|
5af745c90d
|
[RLlib] Implement the SlateQ algorithm (#11450)
|
2020-11-03 09:52:04 +01:00 |
|
Sven Mika
|
36bda8432b
|
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)
|
2020-10-01 16:57:10 +02:00 |
|
Sven Mika
|
e968b52cb7
|
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)
|
2020-08-21 12:35:16 +02:00 |
|
Barak Michener
|
8e76796fd0
|
ci: Redo format.sh --all script & backfill lint fixes (#9956)
|
2020-08-07 16:49:49 -07:00 |
|
Sven Mika
|
e6ea33a03c
|
[RLlib] Enhance reward clipping test; add action_clipping tests. (#9684)
|
2020-07-28 10:44:54 +02:00 |
|
Sven Mika
|
5f278c6411
|
[RLlib] Examples folder restructuring (models) part 1 (#8353)
|
2020-05-08 08:20:18 +02:00 |
|