Commit graph

23 commits

Author SHA1 Message Date
kourosh hakhamaneshi
4607e788c1
[RLlib] Fix test_ope flakiness (#27676) 2022-08-09 16:12:30 -07:00
Rohan Potdar
5b6a58ed28
[RLlib] Add OPE Learning Tests (#27154) 2022-08-02 17:51:38 -07:00
Jun Gong
6b6d3017ba
[RLlib] more connector polishes and fixes. (#26645) 2022-07-19 08:50:28 -07:00
Sven Mika
a8494742a3
[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412) 2022-04-12 07:50:09 +02:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. (#21008) 2021-12-13 12:04:23 +01:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. (#18809) 2021-09-23 08:31:51 +02:00
simonsays1980
60aee4a330
[RLlib] Add example script for bare metal Policy with custom view_requirements. (#17896) 2021-08-20 12:17:13 +02:00
Richard Liaw
a78a2263e5
[RLlib] Fix reverted RockPaperScissors Pettingzoo example (#16896) 2021-07-22 10:55:07 -04:00
Amog Kamsetty
ecb632140f
Revert "RockPaperScissors Pettingzoo" (#16886)
This reverts commit bf3e3225b6.
2021-07-06 09:43:47 -07:00
Rodrigo de Lazcano
bf3e3225b6
RockPaperScissors Pettingzoo (#16725) 2021-07-05 09:52:08 -07:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.
* Fix.
* Fix.
* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
desktable
5af745c90d
[RLlib] Implement the SlateQ algorithm (#11450) 2020-11-03 09:52:04 +01:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Sven Mika
e6ea33a03c
[RLlib] Enhance reward clipping test; add action_clipping tests. (#9684) 2020-07-28 10:44:54 +02:00
Sven Mika
5f278c6411
[RLlib] Examples folder restructuring (models) part 1 (#8353) 2020-05-08 08:20:18 +02:00