Rohan Potdar
69f6b843da
[RLlib] Test output length in DatasetReader with default IOContext. ( #26852 )
2022-07-23 13:53:59 +02:00
Rohan Potdar
2b13ac85f9
[RLLib]: Make IOContext optional for DatasetReader ( #26694 )
2022-07-21 13:05:00 -07:00
Rohan Potdar
4fded80813
[RLlib]: Fix FQE Policy call ( #26671 )
2022-07-19 00:58:31 -07:00
Rohan Potdar
38c9e1d52a
[RLlib]: Fix OPE trainables ( #26279 )
...
Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2022-07-17 14:25:53 -07:00
kourosh hakhamaneshi
569fe01096
[RLlib] improved unittests for dataset_reader and fixed bugs ( #26458 )
2022-07-17 13:38:15 -07:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config ( #25911 )
2022-07-12 11:04:34 -07:00
kourosh hakhamaneshi
be6e4c644f
[RLlib] Feature importance evaluation for offline RL ( #26412 )
2022-07-11 18:12:50 -07:00
Jun Gong
0c469e490e
[RLlib] Checkpoint and restore connectors. ( #26253 )
2022-07-09 01:06:24 -07:00
Avnish Narayan
1243ed62bf
[RLlib] Make Dataset reader default reader and enable CRR to use dataset ( #26304 )
...
Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>
2022-07-08 12:43:35 -07:00
Avnish Narayan
1f9282a496
[RLlib, Offline] Make the dataset and json readers batchable ( #26055 )
...
Make the dataset and json readers batchable.
2022-06-29 11:52:40 -07:00
Rohan Potdar
28df3f34f5
[RLlib]: Off-Policy Evaluation fixes. ( #25899 )
2022-06-21 13:24:24 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
Jun Gong
9b65d5535d
[RLlib] Introduce basic connectors library. ( #25311 )
2022-06-07 19:18:14 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. ( #25056 )
2022-06-07 12:52:19 +02:00
Jun Gong
1d24d6af98
[RLlib] Fix MARWIL tf policy. ( #25384 )
2022-06-03 10:50:36 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. ( #25107 )
2022-05-27 13:14:54 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ( #24999 )
...
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Rohan Potdar
5a70b732e8
[RLlib] MARWIL and BC Config. ( #24853 )
2022-05-21 12:50:20 +02:00
Sven Mika
7cca7782f1
[RLlib] OPE (off policy estimator) API. ( #24384 )
2022-05-02 21:15:50 +02:00
Kai Fricke
8c2e471265
[AIR] Add RLTrainer interface, implementation, and examples ( #23465 )
...
This PR adds a RLTrainer to Ray AIR. It works for both offline and online use cases. In offline training, it will leverage the datasets key of the Trainer API to specify a dataset reader input, used e.g. in Behavioral Cloning (BC). In online training, it is a wrapper around the rllib trainables making use of the parameter layering enabled by the Trainer API.
2022-04-08 17:16:42 -07:00
Kai Fricke
262d6121bb
[rllib] Fix error messages and example for dataset writer ( #23419 )
...
Currently the error message and example refer to a field type that is actually format.
2022-03-28 19:53:12 +01:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI ( #23418 )
2022-03-24 17:04:02 -07:00
Sven Mika
0af100ffae
[RLlib] Fix tree.flatten dict ordering bug: flatten_space([obs_space]) should produce same struct as tree.flatten([obs]). ( #22731 )
2022-03-01 21:24:24 +01:00
Jun Gong
6f5afcbce9
[RLlib] Docs enhancements: Setup-dev instructions; Ray datasets integration. ( #22239 )
2022-02-15 09:09:24 +01:00
Jun Gong
87fe033f7b
[RLlib] Request CPU resources in Trainer.default_resource_request() if using dataset input. ( #21948 )
2022-02-02 10:20:37 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Jun Gong
099c170ab4
[RLlib] Dataset Reader/Writer for RLlib ( #21808 )
2022-01-26 16:00:46 +01:00
xwjiang2010
9af8f11191
Revert "[docs] Clean up doc structure (first part) ( #21667 )" ( #21763 )
...
This reverts commit 38e46c9fb3.
2022-01-20 15:30:56 -08:00
Max Pumperla
38e46c9fb3
[docs] Clean up doc structure (first part) ( #21667 )
2022-01-20 16:19:04 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Sven Mika
f814c2af89
[RLlib; Docs] Docs API reference pages: rllib/execution, rllib/evaluation, rllib/models, rllib/offline. ( #20538 )
2021-12-10 09:41:29 +01:00
Sungho Joo
dc51af798c
[RLlib] Minor fix on json encoding during worker sampling ( #20134 )
...
* import custom json encoder from util and improve encoder default function
* linting
2021-11-09 16:46:41 -08:00
Sven Mika
ea2bea7e30
[RLlib; Docs overhaul] Docstring cleanup: Offline. ( #19808 )
2021-11-01 10:59:53 +01:00
Sven Mika
cabaa3b3c6
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. ( #18381 )
2021-09-07 11:48:41 +02:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. ( #17031 )
2021-07-19 13:16:03 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). ( #17014 )
2021-07-13 14:01:30 -04:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." ( #17002 )
...
This reverts commit 7862dd64ea.
2021-07-12 11:09:14 -07:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets ( #16957 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. ( #16774 )
2021-07-08 17:31:34 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
Sven Mika
e2be41b407
[RLlib] MARWIL + BC: Various fixes and enhancements. ( #16218 )
2021-06-03 22:29:00 +02:00
Sven Mika
c4a3e1589b
[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. ( #15761 )
2021-05-13 09:17:23 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. ( #15603 )
...
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. ( #15335 )
2021-04-15 19:19:51 +02:00
Sven Mika
6708211b59
[RLlib] JSONReader: Mix files if > 1 at beginning (each worker should start with different file). ( #14865 )
2021-03-24 16:07:40 +01:00
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark ( #13550 )
2021-01-21 16:43:55 +01:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. ( #12718 )
2020-12-30 17:32:21 -08:00