Commit graph

1402 commits

Author SHA1 Message Date
Artur Niederfahrenhorst
56e7800e0b
[RLlib] Tolerate nan metrics in LearnerInfoBuilder. (#27981) 2022-08-23 10:07:32 -07:00
Artur Niederfahrenhorst
7ddd14b5db
[RLlib] Fix PPOTorchPolicy producing float metrics when not using critic. (#27980) 2022-08-22 09:41:36 -07:00
Jun Gong
62b91cbec0
[docs][rllib] Documentation for connectors. (#27528)
Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2022-08-19 14:35:07 -07:00
Jun Gong
ec38b96eba
[RLlib] quick fix for learning rate schedule for APPO algorithm. (#28013) 2022-08-19 14:34:34 -07:00
Charles Sun
edde905741
[RLlib] Add Decision Transformer (DT) (#27890) 2022-08-17 13:49:13 -07:00
Artur Niederfahrenhorst
f7b4c5a7ec
[RLlib] Remove unneeded args from offline learning examples. (#26666) 2022-08-17 17:59:27 +02:00
Charles Sun
9330d8f244
[RLlib] Add DTTorchPolicy (#27889) 2022-08-17 00:28:00 -07:00
Charles Sun
61880591e9
[RLlib] Add DTTorchModel (#27872) 2022-08-16 18:18:29 -07:00
Charles Sun
753fad9cad
[RLlib] Add Segmentation Buffer for DT (#27829) 2022-08-16 15:20:41 -07:00
Jiajun Yao
c5a4605030
Fix grammer of error message (#27900)
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
2022-08-16 11:26:03 -07:00
Sven Mika
436c89ba1a
[RLlib] Eval workers use async req manager. (#27390) 2022-08-16 12:05:55 +02:00
kourosh hakhamaneshi
5520a96ce0
[RLlib] Fix get_init_state annotation in torch and define more specific TensorType. (#27791) 2022-08-11 20:02:17 +02:00
Artur Niederfahrenhorst
310ccdf5a3
[RLlib] Fix SAC config parameter that is not used. (#27741) 2022-08-11 18:57:55 +02:00
Artur Niederfahrenhorst
0dceddb912
[RLlib] Move learning_starts logic from buffers into training_step(). (#26032) 2022-08-11 13:07:30 +02:00
Artur Niederfahrenhorst
894e19f791
[RLlib] Dreamer's Episodic buffer should abide by ReplayBuffer API. (#27424) 2022-08-11 09:13:55 +02:00
Artur Niederfahrenhorst
04bc845360
[RLlib] Fix priority update for sequenced batches. (#27544) 2022-08-10 12:48:25 +02:00
kourosh hakhamaneshi
4607e788c1
[RLlib] Fix test_ope flakiness (#27676) 2022-08-09 16:12:30 -07:00
kourosh hakhamaneshi
3b3c20209b
[RLlib] Fix dqn reproducibility (#27459) 2022-08-09 15:56:44 -07:00
Charles Sun
c358305ca6
[RLlib] DatasetReader action normalization. (#27356) 2022-08-09 16:54:03 +02:00
Sven Mika
537f7c65c1
[RLlib] CRR framework torch by default. (#27161) 2022-08-09 16:53:00 +02:00
kourosh hakhamaneshi
b84dd38f01
[RLlib] Add __getitem__ to MultiAgentBatch to access policy_batches. (#27619) 2022-08-09 16:51:26 +02:00
kourosh hakhamaneshi
98b9fa6944
[RLlib] Hotfix for connector tests (#27654)
hot fix for rllib connector tests

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2022-08-08 15:12:47 -07:00
Artur Niederfahrenhorst
4fe47d069f
[RLlib] Require ApeX LR schedule test to produce learner info. (#27557) 2022-08-08 18:19:02 +02:00
kourosh hakhamaneshi
3b2a8427af
[RLlib] Fix SampleBatch to_device(). (#27572) 2022-08-08 18:18:33 +02:00
Jun Gong
a61095a480
[RLlib] fix bandit pre-merge tests (#27554) 2022-08-07 17:48:29 -07:00
Jun Gong
5f07987ab1
[RLlib] Fix connector examples (#27583) 2022-08-07 17:48:09 -07:00
Jun Gong
f8b2128f16
[RLlib] async_request_test needs to run exclusively. (#27603) 2022-08-07 17:47:29 -07:00
Avnish Narayan
55209692ee
[RLlib] Deflake MARWIL and BC and remove memory leak from torch MARWIL policy (#27406) 2022-08-03 16:53:12 -07:00
Rohan Potdar
5b6a58ed28
[RLlib] Add OPE Learning Tests (#27154) 2022-08-02 17:51:38 -07:00
Avnish Narayan
00f9438101
[RLlib] Training step docs. (#27344) 2022-08-02 23:41:45 +02:00
Jun Gong
61add8ede6
[RLlib] Fix the last cartpole-crashing premerge test. (#27315) 2022-08-02 20:08:33 +02:00
kourosh hakhamaneshi
bda5026428
[RLlib] Fix A2C release tests (#27314) 2022-08-02 10:44:52 -07:00
kourosh hakhamaneshi
8d848890f1
[RLlib] Fix default view_requirement in policy.py (#27255) 2022-08-02 10:44:07 -07:00
Artur Niederfahrenhorst
a598458c46
[RLlib] Fix complex torch one-hot and flattened layers not being added to module list. (#27304) 2022-08-01 15:52:28 +02:00
Steven Morad
d0a8e3c36f
[RLlib] User-friendly RNN sequencing. (#27087) 2022-08-01 15:32:22 +02:00
Steven Morad
77318abfaf
[RLlib] Warn on PPO infinite KL loss term. (#26629) 2022-08-01 12:55:26 +02:00
Jun Gong
e6e10ce4cf
[RLlib] Revert 41c9ef70. (#27243)
Why are these changes needed?
Also:
Add validation to make sure multi-gpu and micro-batch are not used together.
Update A2C learning test to hit the microbatching branch.
Minor comment updates.
2022-07-29 11:05:15 -07:00
Kai Fricke
1d3c167bfe
[rllib/release] Fix rllib connect test with Tuner() API (#27155)
Currently failing because the Tune framework example does not return fitting results.

Signed-off-by: Kai Fricke <kai@anyscale.com>
2022-07-28 11:08:02 +01:00
Eric Liang
a4434fac7f
[docs] Fix the remaining style violations in docstrings and add lint rule (#27033) 2022-07-27 22:24:20 -07:00
xwjiang2010
eb69c1ca28
[air] Add annotation for Tune module. (#27060)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-07-27 13:53:46 -07:00
Malinda
1d789aee63
[RLlib/Serve/Release tests] Few code refactoring for better use of efficient NumPy functions. (#26284) 2022-07-27 22:38:35 +02:00
Jun Gong
e1cf0cc982
[RLlib] Deflake cartpole crashing tests. (#27097)
Make sure cartpole crashing tests are not flaky.
2022-07-27 12:50:34 -07:00
Jun Gong
acf2bf9b2f
[RLlib] Get rid of all these deprecation warnings. (#27085) 2022-07-27 10:48:54 -07:00
Kai Fricke
8fda425eca
[tune/rllib] Hotfix ml_utils deprecation import error (#27095)
The changes conflicted with a recently merged PR that refactored the package structure (#27005).

Signed-off-by: Kai Fricke <kai@anyscale.com>
2022-07-27 16:11:58 +01:00
Kai Fricke
a5ea99cf95
[rfc] [tune/rllib] Fetch _progress_metrics from trainable for verbose=2 display (#26967)
RLlib's trainables produce a large number of metrics, which make the log output with verbose=2 illegible. This PR introduces a private `_progress_metrics` property for trainables. If set, the trial progress callback will only print these metrics by default, unless overridden, e.g. with a custom `TrialProgressCallback`.
2022-07-27 16:04:23 +01:00
Amog Kamsetty
862d10c162
[AIR] Remove ML code from ray.util (#27005)
Removes all ML related code from `ray.util`

Removes:
- `ray.util.xgboost`
- `ray.util.lightgbm`
- `ray.util.horovod`
- `ray.util.ray_lightning`

Moves `ray.util.ml_utils` to other locations

Closes #23900

Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-07-27 14:24:19 +01:00
xwjiang2010
fcf897ee72
[air] update rllib example to use Tuner API. (#26987)
update rllib example to use Tuner API.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
2022-07-27 12:12:59 +01:00
Jun Gong
c7ae787cc8
[RLlib] Beef up worker failure test. (#26953) 2022-07-27 00:10:45 -07:00
Jun Gong
a22457b548
[RLlib] Small bug fix (#27003) 2022-07-27 00:02:18 -07:00
Jun Gong
54df8bfe42
[RLlib] Try to checkpoint a durable policy name (#27016) 2022-07-27 00:01:14 -07:00