Artur Niederfahrenhorst
51d16b8ff9
[RLlib] Test against failure of nodes, for example for practical use of spot instances. ( #26676 )
2022-08-29 14:37:56 +02:00
Artur Niederfahrenhorst
7ddd14b5db
[RLlib] Fix PPOTorchPolicy producing float metrics when not using critic. ( #27980 )
2022-08-22 09:41:36 -07:00
Jun Gong
ec38b96eba
[RLlib] quick fix for learning rate schedule for APPO algorithm. ( #28013 )
2022-08-19 14:34:34 -07:00
Charles Sun
edde905741
[RLlib] Add Decision Transformer (DT) ( #27890 )
2022-08-17 13:49:13 -07:00
Artur Niederfahrenhorst
f7b4c5a7ec
[RLlib] Remove unneeded args from offline learning examples. ( #26666 )
2022-08-17 17:59:27 +02:00
Charles Sun
9330d8f244
[RLlib] Add DTTorchPolicy ( #27889 )
2022-08-17 00:28:00 -07:00
Charles Sun
61880591e9
[RLlib] Add DTTorchModel ( #27872 )
2022-08-16 18:18:29 -07:00
Charles Sun
753fad9cad
[RLlib] Add Segmentation Buffer for DT ( #27829 )
2022-08-16 15:20:41 -07:00
Jiajun Yao
c5a4605030
Fix grammar of error message ( #27900 )
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
2022-08-16 11:26:03 -07:00
Sven Mika
436c89ba1a
[RLlib] Eval workers use async req manager. ( #27390 )
2022-08-16 12:05:55 +02:00
Artur Niederfahrenhorst
310ccdf5a3
[RLlib] Fix SAC config parameter that is not used. ( #27741 )
2022-08-11 18:57:55 +02:00
Artur Niederfahrenhorst
0dceddb912
[RLlib] Move learning_starts logic from buffers into `training_step()`. ( #26032 )
2022-08-11 13:07:30 +02:00
Artur Niederfahrenhorst
894e19f791
[RLlib] Dreamer's Episodic buffer should abide by ReplayBuffer API. ( #27424 )
2022-08-11 09:13:55 +02:00
kourosh hakhamaneshi
3b3c20209b
[RLlib] Fix dqn reproducibility ( #27459 )
2022-08-09 15:56:44 -07:00
Sven Mika
537f7c65c1
[RLlib] CRR framework torch by default. ( #27161 )
2022-08-09 16:53:00 +02:00
Artur Niederfahrenhorst
4fe47d069f
[RLlib] Require ApeX LR schedule test to produce learner info. ( #27557 )
2022-08-08 18:19:02 +02:00
Avnish Narayan
55209692ee
[RLlib] Deflake MARWIL and BC and remove memory leak from torch MARWIL policy ( #27406 )
2022-08-03 16:53:12 -07:00
kourosh hakhamaneshi
bda5026428
[RLlib] Fix A2C release tests ( #27314 )
2022-08-02 10:44:52 -07:00
Steven Morad
77318abfaf
[RLlib] Warn on PPO infinite KL loss term. ( #26629 )
2022-08-01 12:55:26 +02:00
Jun Gong
e6e10ce4cf
[RLlib] Revert 41c9ef70. ( #27243 )
Why are these changes needed?
Also:
- Add validation to make sure multi-GPU and micro-batching are not used together.
- Update A2C learning test to hit the microbatching branch.
- Minor comment updates.
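The validation mentioned above could look roughly like the sketch below. It is not the code from this commit: the function name `validate_a2c_config`, the plain-dict config, and the choice of `ValueError` are assumptions made for illustration, and the keys `microbatch_size` and `num_gpus` are assumed to mirror the A2C settings.

```python
def validate_a2c_config(config: dict) -> None:
    """Reject configs that combine micro-batching with multi-GPU training.

    Hypothetical helper for illustration; not RLlib's actual validation code.
    """
    microbatch_size = config.get("microbatch_size")  # assumed key name
    num_gpus = config.get("num_gpus", 0)  # assumed key name

    # Micro-batching is considered enabled whenever a size is given.
    if microbatch_size is not None and num_gpus > 1:
        raise ValueError(
            "Micro-batching and multi-GPU training cannot be used together; "
            "unset `microbatch_size` or set `num_gpus` to at most 1."
        )
```

Failing at config-validation time means the conflict is reported before any workers are started, rather than surfacing later as a training error.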
2022-07-29 11:05:15 -07:00
Jun Gong
acf2bf9b2f
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
Kai Fricke
8fda425eca
[tune/rllib] Hotfix ml_utils deprecation import error ( #27095 )
The changes conflicted with a recently merged PR that refactored the package structure (#27005).
Signed-off-by: Kai Fricke <kai@anyscale.com>
2022-07-27 16:11:58 +01:00
Kai Fricke
a5ea99cf95
[rfc] [tune/rllib] Fetch _progress_metrics from trainable for verbose=2 display ( #26967 )
RLlib's trainables produce a large number of metrics, which makes the log output with verbose=2 illegible. This PR introduces a private `_progress_metrics` property for trainables. If set, the trial progress callback prints only these metrics by default, unless overridden, e.g. with a custom `TrialProgressCallback`.
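A minimal sketch of the filtering idea, assuming a class-level `_progress_metrics` attribute. The classes and the `select_progress_metrics` helper are hypothetical stand-ins and do not reproduce Tune's actual `Trainable` or callback code; the metric names are only examples.

```python
from typing import Dict, Iterable, Optional


class SketchTrainable:
    """Hypothetical stand-in for a trainable; not the real Tune class."""

    # When set, only these metric names are shown in progress output.
    _progress_metrics: Optional[Iterable[str]] = None


class NoisyTrainable(SketchTrainable):
    """A trainable that reports many metrics but highlights only two."""

    _progress_metrics = ("episode_reward_mean", "timesteps_total")


def select_progress_metrics(
    trainable: SketchTrainable, result: Dict[str, float]
) -> Dict[str, float]:
    """Return the subset of `result` that a progress printer would display."""
    wanted = getattr(trainable, "_progress_metrics", None)
    if not wanted:
        # No preference declared: fall back to showing everything.
        return dict(result)
    # Keep the declared order and skip metrics missing from this result.
    return {name: result[name] for name in wanted if name in result}
```

With this shape, a trainable that declares nothing keeps the old behavior, while one that sets `_progress_metrics` gets a short, readable progress line.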
2022-07-27 16:04:23 +01:00
Amog Kamsetty
862d10c162
[AIR] Remove ML code from `ray.util` ( #27005 )
Removes all ML related code from `ray.util`
Removes:
- `ray.util.xgboost`
- `ray.util.lightgbm`
- `ray.util.horovod`
- `ray.util.ray_lightning`
Moves `ray.util.ml_utils` to other locations
Closes #23900
Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-07-27 14:24:19 +01:00
Jun Gong
c7ae787cc8
[RLlib] Beef up worker failure test. ( #26953 )
2022-07-27 00:10:45 -07:00
Jun Gong
54df8bfe42
[RLlib] Try to checkpoint a durable policy name ( #27016 )
2022-07-27 00:01:14 -07:00
Jun Gong
ca5e0dcaf4
[RLlib] Record framework and algorithm used by an RLlib run. ( #26956 )
Automatically record the framework and algorithm used by RLlib jobs, for better planning.
2022-07-25 16:16:36 -07:00
kourosh hakhamaneshi
5030a4c1d3
[RLlib] Simplify agent collector ( #26803 )
2022-07-25 13:17:17 -07:00
Avnish Narayan
41c9ef709a
[RLlib] Using PG when not doing microbatching kills A2C performance. ( #26844 )
2022-07-25 15:11:26 +02:00
Artur Niederfahrenhorst
e9a8f7d9ae
[RLlib] Unify gnorm mixin for tf and torch policies. ( #26102 )
2022-07-24 15:31:09 +02:00
Rohan Potdar
a53bbe49bf
[RLlib]: Raise deprecation warning in MARWIL OPE methods. ( #26893 )
2022-07-23 13:55:40 +02:00
kourosh hakhamaneshi
aec79afda1
[RLlib] Fixes CRR flakiness ( #26770 )
2022-07-20 12:08:57 -07:00
Avnish Narayan
9063cc9d5e
[RLlib] Fix memory leak in APEX_DQN ( #26691 )
2022-07-19 16:16:24 -07:00
Avnish Narayan
af41f21be0
[RLlib] Make queue placement ops blocking ( #26581 )
This change should fix issues with IMPALA and potentially APEX that stem from the various learner threads
Signed-off-by: avnish <avnish@anyscale.com>
2022-07-19 20:07:36 +01:00
Jun Gong
6b6d3017ba
[RLlib] more connector polishes and fixes. ( #26645 )
2022-07-19 08:50:28 -07:00
Rohan Potdar
38c9e1d52a
[RLlib]: Fix OPE trainables ( #26279 )
Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2022-07-17 14:25:53 -07:00
Sven Mika
4aea24c8a8
[RLlib] `restart_failed_sub_environments` now works for MA cases and crashes during `reset()`; +more tests and logging; add eval worker sub-env fault tolerance test. ( #26276 )
2022-07-15 08:55:14 +02:00
Jun Gong
104407a6e5
[RLlib] Fix all the erroneous `on_trainer_init` warning. ( #26433 )
2022-07-13 18:56:01 +02:00
Jun Gong
b383d987d1
[RLlib] Fix a bunch of issues related to connectors. ( #26510 )
2022-07-13 18:55:20 +02:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config ( #25911 )
2022-07-12 11:04:34 -07:00
Avnish Narayan
1243ed62bf
[RLlib] Make Dataset reader default reader and enable CRR to use dataset ( #26304 )
Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>
2022-07-08 12:43:35 -07:00
Sven Mika
ca913ff6d6
[RLlib] Eval WorkerSet crashes when trying to re-add a failed worker (eval set does not have local worker). ( #26134 )
2022-06-30 13:25:22 +02:00
Jun Gong
d83bbda281
[RLlib] Save serialized PolicySpec. Extract `num_gpus` related logics into a util function. ( #25954 )
2022-06-30 11:38:21 +02:00
Jun Gong
52bb8e47d4
[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. ( #25922 )
2022-06-30 08:44:10 +02:00
Artur Niederfahrenhorst
ecd6047e39
Revert "[RLlib] Small Ape-X deflake. ( #26078 )" ( #26191 )
This reverts commit 11a549d4bd.
2022-06-29 10:25:47 -07:00
Artur Niederfahrenhorst
11a549d4bd
[RLlib] Small Ape-X deflake. ( #26078 )
2022-06-29 14:06:42 +02:00
Sven Mika
2b43713785
[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). ( #25851 )
2022-06-29 08:41:47 +02:00
Charles Sun
70f94e6d63
[RLlib] Migrating DDPG to PolicyV2. ( #26054 )
2022-06-28 15:52:56 +02:00
kourosh hakhamaneshi
f421730b47
[RLlib] Added `expectation` advantage_type option to CRR. ( #26142 )
2022-06-28 15:40:09 +02:00
Sven Mika
762cfbdff1
[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. ( #26117 )
2022-06-28 15:14:37 +02:00