105 commits

Author SHA1 Message Date
Sven Mika
537f7c65c1
[RLlib] CRR framework torch by default. (#27161) 2022-08-09 16:53:00 +02:00
Artur Niederfahrenhorst
4fe47d069f
[RLlib] Require ApeX LR schedule test to produce learner info. (#27557) 2022-08-08 18:19:02 +02:00
Avnish Narayan
55209692ee
[RLlib] Deflake MARWIL and BC and remove memory leak from torch MARWIL policy (#27406) 2022-08-03 16:53:12 -07:00
kourosh hakhamaneshi
bda5026428
[RLlib] Fix A2C release tests (#27314) 2022-08-02 10:44:52 -07:00
Steven Morad
77318abfaf
[RLlib] Warn on PPO infinite KL loss term. (#26629) 2022-08-01 12:55:26 +02:00
Jun Gong
e6e10ce4cf
[RLlib] Revert 41c9ef70. (#27243)
Also:
- Add validation to make sure multi-GPU and micro-batching are not used together.
- Update the A2C learning test to hit the micro-batching branch.
- Minor comment updates.
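
The validation mentioned above could look roughly like the following. This is a minimal sketch, not the code from the PR: the function name, the plain-dict config, the `microbatch_size` and `num_gpus` keys, and the choice of `ValueError` are all assumptions made for illustration.

```python
from typing import Any, Mapping


def validate_a2c_like_config(config: Mapping[str, Any]) -> None:
    """Reject configs that combine micro-batching with more than one GPU.

    Hypothetical helper: the key names and the exception type are
    illustrative assumptions, not RLlib's actual implementation.
    """
    microbatch_size = config.get("microbatch_size")  # None or 0 means "off"
    num_gpus = config.get("num_gpus", 0)

    if microbatch_size and num_gpus > 1:
        raise ValueError(
            "Micro-batching and multi-GPU training cannot be used together: "
            f"got microbatch_size={microbatch_size}, num_gpus={num_gpus}."
        )


# A single-GPU config with micro-batching passes silently.
validate_a2c_like_config({"microbatch_size": 32, "num_gpus": 1})
```

Failing early at config-validation time keeps the unsupported combination from surfacing later as a confusing training error.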
2022-07-29 11:05:15 -07:00
Jun Gong
acf2bf9b2f
[RLlib] Get rid of all these deprecation warnings. (#27085) 2022-07-27 10:48:54 -07:00
Kai Fricke
8fda425eca
[tune/rllib] Hotfix ml_utils deprecation import error (#27095)
The changes conflicted with a recently merged PR that refactored the package structure (#27005).

Signed-off-by: Kai Fricke <kai@anyscale.com>
2022-07-27 16:11:58 +01:00
Kai Fricke
a5ea99cf95
[rfc] [tune/rllib] Fetch _progress_metrics from trainable for verbose=2 display (#26967)
RLlib's trainables produce a large number of metrics, which makes the log output with verbose=2 illegible. This PR introduces a private `_progress_metrics` property for trainables. If set, the trial progress callback will only print these metrics by default, unless overridden, e.g. with a custom `TrialProgressCallback`.
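
The filtering behaviour described above can be sketched without Ray as follows. Everything here is hypothetical: `SketchTrainable`, `SketchRLTrainable`, `select_progress_metrics`, and the metric names are stand-ins chosen for illustration, and a class attribute is used in place of a property for brevity. The real callback in Tune may work differently.

```python
from typing import Any, Dict, Optional, Sequence


class SketchTrainable:
    """Hypothetical stand-in for a trainable; not Ray's actual class."""

    # Metrics to show in progress output. None means "show everything".
    _progress_metrics: Optional[Sequence[str]] = None


class SketchRLTrainable(SketchTrainable):
    """A trainable that emits many metrics but highlights only a few."""

    # Assumed metric names, used only for this example.
    _progress_metrics = ("episode_reward_mean", "timesteps_total")


def select_progress_metrics(
    trainable_cls: type, result: Dict[str, Any]
) -> Dict[str, Any]:
    """Return only the metrics the trainable asks to display, if any."""
    wanted = getattr(trainable_cls, "_progress_metrics", None)
    if not wanted:
        # No preference declared: fall back to the full result.
        return dict(result)
    # Keep the declared order and skip metrics missing from this result.
    return {key: result[key] for key in wanted if key in result}


result = {
    "episode_reward_mean": 1.5,
    "timesteps_total": 4000,
    "info": {"learner": {}},
}
shown = select_progress_metrics(SketchRLTrainable, result)
```

Keeping the preference on the trainable, rather than in the reporter, lets each algorithm declare its own short list while a custom callback can still ignore it.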
2022-07-27 16:04:23 +01:00
Amog Kamsetty
862d10c162
[AIR] Remove ML code from ray.util (#27005)
Removes all ML-related code from `ray.util`.

Removes:
- `ray.util.xgboost`
- `ray.util.lightgbm`
- `ray.util.horovod`
- `ray.util.ray_lightning`

Moves `ray.util.ml_utils` to other locations

Closes #23900

Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-07-27 14:24:19 +01:00
Jun Gong
c7ae787cc8
[RLlib] Beef up worker failure test. (#26953) 2022-07-27 00:10:45 -07:00
Jun Gong
54df8bfe42
[RLlib] Try to checkpoint a durable policy name (#27016) 2022-07-27 00:01:14 -07:00
Jun Gong
ca5e0dcaf4
[RLLib] Record framework and algorithm used by an RLlib run. (#26956)
Automatically record the framework and algorithm used by RLlib jobs, for better planning.
2022-07-25 16:16:36 -07:00
kourosh hakhamaneshi
5030a4c1d3
[RLlib] Simplify agent collector (#26803) 2022-07-25 13:17:17 -07:00
Avnish Narayan
41c9ef709a
[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844) 2022-07-25 15:11:26 +02:00
Artur Niederfahrenhorst
e9a8f7d9ae
[RLlib] Unify gnorm mixin for tf and torch policies. (#26102) 2022-07-24 15:31:09 +02:00
Rohan Potdar
a53bbe49bf
[RLlib]: Raise deprecation warning in MARWIL OPE methods. (#26893) 2022-07-23 13:55:40 +02:00
kourosh hakhamaneshi
aec79afda1
[RLlib] Fixes CRR flakeyness (#26770) 2022-07-20 12:08:57 -07:00
Avnish Narayan
9063cc9d5e
[RLlib] Fix memory leak in APEX_DQN (#26691) 2022-07-19 16:16:24 -07:00
Avnish Narayan
af41f21be0
[RLlib] Make queue placement ops blocking (#26581)
This change should fix issues with IMPALA and potentially APEX that stem from the various learner threads

Signed-off-by: avnish <avnish@anyscale.com>
2022-07-19 20:07:36 +01:00
Jun Gong
6b6d3017ba
[RLlib] more connector polishes and fixes. (#26645) 2022-07-19 08:50:28 -07:00
Rohan Potdar
38c9e1d52a
[RLlib]: Fix OPE trainables (#26279)
Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2022-07-17 14:25:53 -07:00
Sven Mika
4aea24c8a8
[RLlib] restart_failed_sub_environments now works for MA cases and crashes during reset(); +more tests and logging; add eval worker sub-env fault tolerance test. (#26276) 2022-07-15 08:55:14 +02:00
Jun Gong
104407a6e5
[RLlib] Fix all the erroneous on_trainer_init warning. (#26433) 2022-07-13 18:56:01 +02:00
Jun Gong
b383d987d1
[RLlib] Fix a bunch of issues related to connectors. (#26510) 2022-07-13 18:55:20 +02:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config (#25911) 2022-07-12 11:04:34 -07:00
Avnish Narayan
1243ed62bf
[RLlib] Make Dataset reader default reader and enable CRR to use dataset (#26304)
Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>
2022-07-08 12:43:35 -07:00
Sven Mika
ca913ff6d6
[RLlib] Eval WorkerSet crashes when trying to re-add a failed worker (eval set does not have local worker). (#26134) 2022-06-30 13:25:22 +02:00
Jun Gong
d83bbda281
[RLlib] Save serialized PolicySpec. Extract num_gpus related logics into a util function. (#25954) 2022-06-30 11:38:21 +02:00
Jun Gong
52bb8e47d4
[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922) 2022-06-30 08:44:10 +02:00
Artur Niederfahrenhorst
ecd6047e39
Revert "[RLlib] Small Ape-X deflake. (#26078)" (#26191)
This reverts commit 11a549d4bd.
2022-06-29 10:25:47 -07:00
Artur Niederfahrenhorst
11a549d4bd
[RLlib] Small Ape-X deflake. (#26078) 2022-06-29 14:06:42 +02:00
Sven Mika
2b43713785
[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851) 2022-06-29 08:41:47 +02:00
Charles Sun
70f94e6d63
[RLlib] Migrating DDPG to PolicyV2. (#26054) 2022-06-28 15:52:56 +02:00
kourosh hakhamaneshi
f421730b47
[RLlib] Added expectation advantage_type option to CRR. (#26142) 2022-06-28 15:40:09 +02:00
Sven Mika
762cfbdff1
[RLlib] IMPALA and APPO metrics fixes; remove deprecated async_parallel_requests utility. (#26117) 2022-06-28 15:14:37 +02:00
Artur Niederfahrenhorst
efea87f0cb
[RLlib] SimpleQ PyTorch Multi GPU fix (#26109) 2022-06-28 12:12:56 +02:00
Artur Niederfahrenhorst
bed9083f35
[RLlib] Add timeout to filter synchronization. (#25959) 2022-06-24 14:37:43 +02:00
Artur Niederfahrenhorst
a3f1323457
[RLlib] Make QMix use the ReplayBufferAPI (#25560) 2022-06-23 22:55:22 -07:00
Sven Mika
59a967a3a0
[RLlib] Cleanup some deprecated metric keys and classes. (#26036) 2022-06-23 21:30:01 +02:00
Kai Fricke
8a2f6bda62
[tune/structure] Introduce experiment package (#26033)
Experiment, Trial, and config parsing moves into an `experiment` package.

Notably, the new public-facing APIs will be:

```
from ray.tune.experiment import Experiment
from ray.tune.experiment import Trial
```
2022-06-23 14:52:46 +01:00
Kai Fricke
0959f44b6f
[tune/structure] Introduce execution package (#26015)
Execution-specific packages are moved to tune.execution.

Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
2022-06-23 11:13:19 +01:00
Avnish Narayan
871aef80dc
[RLlib] Aggregate Impala learner info. (#25856) 2022-06-22 09:43:10 +02:00
Artur Niederfahrenhorst
dcbc225728
[RLlib] Fix DDPG test ignoring framework_iterator-modified config. (#25913) 2022-06-21 16:17:42 +02:00
Avnish Narayan
d859b84058
[RLlib] Add compute log likelihoods test for CRR. (#25905) 2022-06-21 16:06:10 +02:00
Sven Mika
1499af945b
[RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924) 2022-06-20 19:53:47 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869) 2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). (#25871) 2022-06-17 20:12:16 +02:00
Fabian Witter
fcdf710574
[RLlib] Move offline input into replay buffer using rollout ops in CQL. (#25629) 2022-06-17 17:08:55 +02:00
Artur Niederfahrenhorst
a322cc5765
[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) 2022-06-17 14:10:36 +02:00