Commit graph

119 commits

Author SHA1 Message Date
Artur Niederfahrenhorst
51d16b8ff9
[RLlib] Test against failure of nodes, for example for practical use of spot instances. (#26676) 2022-08-29 14:37:56 +02:00
Artur Niederfahrenhorst
7ddd14b5db
[RLlib] Fix PPOTorchPolicy producing float metrics when not using critic. (#27980) 2022-08-22 09:41:36 -07:00
Jun Gong
ec38b96eba
[RLlib] quick fix for learning rate schedule for APPO algorithm. (#28013) 2022-08-19 14:34:34 -07:00
Charles Sun
edde905741
[RLlib] Add Decision Transformer (DT) (#27890) 2022-08-17 13:49:13 -07:00
Artur Niederfahrenhorst
f7b4c5a7ec
[RLlib] Remove unneeded args from offline learning examples. (#26666) 2022-08-17 17:59:27 +02:00
Charles Sun
9330d8f244
[RLlib] Add DTTorchPolicy (#27889) 2022-08-17 00:28:00 -07:00
Charles Sun
61880591e9
[RLlib] Add DTTorchModel (#27872) 2022-08-16 18:18:29 -07:00
Charles Sun
753fad9cad
[RLlib] Add Segmentation Buffer for DT (#27829) 2022-08-16 15:20:41 -07:00
Jiajun Yao
c5a4605030
Fix grammer of error message (#27900)
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
2022-08-16 11:26:03 -07:00
Sven Mika
436c89ba1a
[RLlib] Eval workers use async req manager. (#27390) 2022-08-16 12:05:55 +02:00
Artur Niederfahrenhorst
310ccdf5a3
[RLlib] Fix SAC config parameter that is not used. (#27741) 2022-08-11 18:57:55 +02:00
Artur Niederfahrenhorst
0dceddb912
[RLlib] Move learning_starts logic from buffers into training_step(). (#26032) 2022-08-11 13:07:30 +02:00
Artur Niederfahrenhorst
894e19f791
[RLlib] Dreamer's Episodic buffer should abide by ReplayBuffer API. (#27424) 2022-08-11 09:13:55 +02:00
kourosh hakhamaneshi
3b3c20209b
[RLlib] Fix dqn reproducibility (#27459) 2022-08-09 15:56:44 -07:00
Sven Mika
537f7c65c1
[RLlib] CRR framework torch by default. (#27161) 2022-08-09 16:53:00 +02:00
Artur Niederfahrenhorst
4fe47d069f
[RLlib] Require ApeX LR schedule test to produce learner info. (#27557) 2022-08-08 18:19:02 +02:00
Avnish Narayan
55209692ee
[RLlib] Deflake MARWIL and BC and remove memory leak from torch MARWIL policy (#27406) 2022-08-03 16:53:12 -07:00
kourosh hakhamaneshi
bda5026428
[RLlib] Fix A2C release tests (#27314) 2022-08-02 10:44:52 -07:00
Steven Morad
77318abfaf
[RLlib] Warn on PPO infinite KL loss term. (#26629) 2022-08-01 12:55:26 +02:00
Jun Gong
e6e10ce4cf
[RLlib] Revert 41c9ef70. (#27243)
Why are these changes needed?
Also:
Add validation to make sure multi-gpu and micro-batch is not used together.
Update A2C learning test to hit the microbatching branch.
Minor comment updates.
2022-07-29 11:05:15 -07:00
Jun Gong
acf2bf9b2f
[RLlib] Get rid of all these deprecation warnings. (#27085) 2022-07-27 10:48:54 -07:00
Kai Fricke
8fda425eca
[tune/rllib] Hotfix ml_utils deprecation import error (#27095)
The changes conflicted with a recently merged PR that refactored the package structure (#27005).

Signed-off-by: Kai Fricke <kai@anyscale.com>
2022-07-27 16:11:58 +01:00
Kai Fricke
a5ea99cf95
[rfc] [tune/rllib] Fetch _progress_metrics from trainable for verbose=2 display (#26967)
RLLibs trainables produce a large number of metrics which makethe log output with verbose=2 illegible. This PR introduces a private `_progress_metrics` property for trainables. If set, the trial progress callback will only print these metrics per default, unless overridden e.g. with a custom `TrialProgressCallback`.
2022-07-27 16:04:23 +01:00
Amog Kamsetty
862d10c162
[AIR] Remove ML code from ray.util (#27005)
Removes all ML related code from `ray.util`

Removes:
- `ray.util.xgboost`
- `ray.util.lightgbm`
- `ray.util.horovod`
- `ray.util.ray_lightning`

Moves `ray.util.ml_utils` to other locations

Closes #23900

Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-07-27 14:24:19 +01:00
Jun Gong
c7ae787cc8
[RLlib] Beef up worker failure test. (#26953) 2022-07-27 00:10:45 -07:00
Jun Gong
54df8bfe42
[RLlib] Try to checkpoint a durable policy name (#27016) 2022-07-27 00:01:14 -07:00
Jun Gong
ca5e0dcaf4
[RLLib] Record framework and algorithm used by an RLlib run. (#26956)
Automatically record framework and algorithm used by RLlib jobs.
For better planning.
2022-07-25 16:16:36 -07:00
kourosh hakhamaneshi
5030a4c1d3
[RLlib] Simplify agent collector (#26803) 2022-07-25 13:17:17 -07:00
Avnish Narayan
41c9ef709a
[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844) 2022-07-25 15:11:26 +02:00
Artur Niederfahrenhorst
e9a8f7d9ae
[RLlib] Unify gnorm mixin for tf and torch policies. (#26102) 2022-07-24 15:31:09 +02:00
Rohan Potdar
a53bbe49bf
[RLlib]: Raise deprecation warning in MARWIL OPE methods. (#26893) 2022-07-23 13:55:40 +02:00
kourosh hakhamaneshi
aec79afda1
[RLlib] Fixes CRR flakeyness (#26770) 2022-07-20 12:08:57 -07:00
Avnish Narayan
9063cc9d5e
[RLlib] Fix memory leak in APEX_DQN (#26691) 2022-07-19 16:16:24 -07:00
Avnish Narayan
af41f21be0
[RLlib] Make queue placement ops blocking (#26581)
Signed-off-by: avnish avnish@anyscale.com

This change should fix issues with IMPALA and potentially APEX that stem from the various learner threads

Signed-off-by: avnish <avnish@anyscale.com>
2022-07-19 20:07:36 +01:00
Jun Gong
6b6d3017ba
[RLlib] more connector polishes and fixes. (#26645) 2022-07-19 08:50:28 -07:00
Rohan Potdar
38c9e1d52a
[RLlib]: Fix OPE trainables (#26279)
Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2022-07-17 14:25:53 -07:00
Sven Mika
4aea24c8a8
[RLlib] restart_failed_sub_environments now works for MA cases and crashes during reset(); +more tests and logging; add eval worker sub-env fault tolerance test. (#26276) 2022-07-15 08:55:14 +02:00
Jun Gong
104407a6e5
[RLlib] Fix all the erroneous on_trainer_init warning. (#26433) 2022-07-13 18:56:01 +02:00
Jun Gong
b383d987d1
[RLlib] Fix a bunch of issues related to connectors. (#26510) 2022-07-13 18:55:20 +02:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config (#25911) 2022-07-12 11:04:34 -07:00
Avnish Narayan
1243ed62bf
[RLlib] Make Dataset reader default reader and enable CRR to use dataset (#26304)
Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>
2022-07-08 12:43:35 -07:00
Sven Mika
ca913ff6d6
[RLlib] Eval WorkerSet crashes when trying to re-add a failed worker (eval set does not have local worker). (#26134) 2022-06-30 13:25:22 +02:00
Jun Gong
d83bbda281
[RLlib] Save serialized PolicySpec. Extract num_gpus related logics into a util function. (#25954) 2022-06-30 11:38:21 +02:00
Jun Gong
52bb8e47d4
[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922) 2022-06-30 08:44:10 +02:00
Artur Niederfahrenhorst
ecd6047e39
Revert "[RLlib] Small Ape-X deflake. (#26078)" (#26191)
This reverts commit 11a549d4bd.
2022-06-29 10:25:47 -07:00
Artur Niederfahrenhorst
11a549d4bd
[RLlib] Small Ape-X deflake. (#26078) 2022-06-29 14:06:42 +02:00
Sven Mika
2b43713785
[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851) 2022-06-29 08:41:47 +02:00
Charles Sun
70f94e6d63
[RLlib] Migrating DDPG to PolicyV2. (#26054) 2022-06-28 15:52:56 +02:00
kourosh hakhamaneshi
f421730b47
[RLlib] Added expectation advantage_type option to CRR. (#26142) 2022-06-28 15:40:09 +02:00
Sven Mika
762cfbdff1
[RLlib] IMPALA and APPO metrics fixes; remove deprecated async_parallel_requests utility. (#26117) 2022-06-28 15:14:37 +02:00