
hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-04 17:41:43 -05:00

Author	SHA1	Message	Date
Artur Niederfahrenhorst	51d16b8ff9	[RLlib] Test against failure of nodes, for example for practical use of spot instances. (#26676 )	2022-08-29 14:37:56 +02:00
Artur Niederfahrenhorst	7ddd14b5db	[RLlib] Fix PPOTorchPolicy producing float metrics when not using critic. (#27980 )	2022-08-22 09:41:36 -07:00
Jun Gong	ec38b96eba	[RLlib] quick fix for learning rate schedule for APPO algorithm. (#28013 )	2022-08-19 14:34:34 -07:00
Charles Sun	edde905741	[RLlib] Add Decision Transformer (DT) (#27890 )	2022-08-17 13:49:13 -07:00
Artur Niederfahrenhorst	f7b4c5a7ec	[RLlib] Remove unneeded args from offline learning examples. (#26666 )	2022-08-17 17:59:27 +02:00
Charles Sun	9330d8f244	[RLlib] Add DTTorchPolicy (#27889 )	2022-08-17 00:28:00 -07:00
Charles Sun	61880591e9	[RLlib] Add DTTorchModel (#27872 )	2022-08-16 18:18:29 -07:00
Charles Sun	753fad9cad	[RLlib] Add Segmentation Buffer for DT (#27829 )	2022-08-16 15:20:41 -07:00
Jiajun Yao	c5a4605030	Fix grammer of error message (#27900 ) Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>	2022-08-16 11:26:03 -07:00
Sven Mika	436c89ba1a	[RLlib] Eval workers use async req manager. (#27390 )	2022-08-16 12:05:55 +02:00
Artur Niederfahrenhorst	310ccdf5a3	[RLlib] Fix SAC config parameter that is not used. (#27741 )	2022-08-11 18:57:55 +02:00
Artur Niederfahrenhorst	0dceddb912	[RLlib] Move learning_starts logic from buffers into `training_step()`. (#26032 )	2022-08-11 13:07:30 +02:00
Artur Niederfahrenhorst	894e19f791	[RLlib] Dreamer's Episodic buffer should abide by ReplayBuffer API. (#27424 )	2022-08-11 09:13:55 +02:00
kourosh hakhamaneshi	3b3c20209b	[RLlib] Fix dqn reproducibility (#27459 )	2022-08-09 15:56:44 -07:00
Sven Mika	537f7c65c1	[RLlib] CRR framework torch by default. (#27161 )	2022-08-09 16:53:00 +02:00
Artur Niederfahrenhorst	4fe47d069f	[RLlib] Require ApeX LR schedule test to produce learner info. (#27557 )	2022-08-08 18:19:02 +02:00
Avnish Narayan	55209692ee	[RLlib] Deflake MARWIL and BC and remove memory leak from torch MARWIL policy (#27406 )	2022-08-03 16:53:12 -07:00
kourosh hakhamaneshi	bda5026428	[RLlib] Fix A2C release tests (#27314 )	2022-08-02 10:44:52 -07:00
Steven Morad	77318abfaf	[RLlib] Warn on PPO infinite KL loss term. (#26629 )	2022-08-01 12:55:26 +02:00
Jun Gong	e6e10ce4cf	[RLlib] Revert `41c9ef70`. (#27243 ) Why are these changes needed? Also: Add validation to make sure multi-gpu and micro-batch is not used together. Update A2C learning test to hit the microbatching branch. Minor comment updates.	2022-07-29 11:05:15 -07:00
Jun Gong	acf2bf9b2f	[RLlib] Get rid of all these deprecation warnings. (#27085 )	2022-07-27 10:48:54 -07:00
Kai Fricke	8fda425eca	[tune/rllib] Hotfix ml_utils deprecation import error (#27095 ) The changes conflicted with a recently merged PR that refactored the package structure (#27005). Signed-off-by: Kai Fricke <kai@anyscale.com>	2022-07-27 16:11:58 +01:00
Kai Fricke	a5ea99cf95	[rfc] [tune/rllib] Fetch _progress_metrics from trainable for verbose=2 display (#26967 ) RLLibs trainables produce a large number of metrics which make the log output with verbose=2 illegible. This PR introduces a private `_progress_metrics` property for trainables. If set, the trial progress callback will only print these metrics per default, unless overridden e.g. with a custom `TrialProgressCallback`.	2022-07-27 16:04:23 +01:00
Amog Kamsetty	862d10c162	[AIR] Remove ML code from `ray.util` (#27005 ) Removes all ML related code from `ray.util` Removes: - `ray.util.xgboost` - `ray.util.lightgbm` - `ray.util.horovod` - `ray.util.ray_lightning` Moves `ray.util.ml_utils` to other locations Closes #23900 Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com> Signed-off-by: Kai Fricke <kai@anyscale.com> Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-07-27 14:24:19 +01:00
Jun Gong	c7ae787cc8	[RLlib] Beef up worker failure test. (#26953 )	2022-07-27 00:10:45 -07:00
Jun Gong	54df8bfe42	[RLlib] Try to checkpoint a durable policy name (#27016 )	2022-07-27 00:01:14 -07:00
Jun Gong	ca5e0dcaf4	[RLLib] Record framework and algorithm used by an RLlib run. (#26956 ) Automatically record framework and algorithm used by RLlib jobs. For better planning.	2022-07-25 16:16:36 -07:00
kourosh hakhamaneshi	5030a4c1d3	[RLlib] Simplify agent collector (#26803 )	2022-07-25 13:17:17 -07:00
Avnish Narayan	41c9ef709a	[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844 )	2022-07-25 15:11:26 +02:00
Artur Niederfahrenhorst	e9a8f7d9ae	[RLlib] Unify gnorm mixin for tf and torch policies. (#26102 )	2022-07-24 15:31:09 +02:00
Rohan Potdar	a53bbe49bf	[RLlib]: Raise deprecation warning in MARWIL OPE methods. (#26893 )	2022-07-23 13:55:40 +02:00
kourosh hakhamaneshi	aec79afda1	[RLlib] Fixes CRR flakeyness (#26770 )	2022-07-20 12:08:57 -07:00
Avnish Narayan	9063cc9d5e	[RLlib] Fix memory leak in APEX_DQN (#26691 )	2022-07-19 16:16:24 -07:00
Avnish Narayan	af41f21be0	[RLlib] Make queue placement ops blocking (#26581 ) Signed-off-by: avnish avnish@anyscale.com This change should fix issues with IMPALA and potentially APEX that stem from the various learner threads Signed-off-by: avnish <avnish@anyscale.com>	2022-07-19 20:07:36 +01:00
Jun Gong	6b6d3017ba	[RLlib] more connector polishes and fixes. (#26645 )	2022-07-19 08:50:28 -07:00
Rohan Potdar	38c9e1d52a	[RLlib]: Fix OPE trainables (#26279 ) Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2022-07-17 14:25:53 -07:00
Sven Mika	4aea24c8a8	[RLlib] `restart_failed_sub_environments` now works for MA cases and crashes during `reset()`; +more tests and logging; add eval worker sub-env fault tolerance test. (#26276 )	2022-07-15 08:55:14 +02:00
Jun Gong	104407a6e5	[RLlib] Fix all the erroneous `on_trainer_init` warning. (#26433 )	2022-07-13 18:56:01 +02:00
Jun Gong	b383d987d1	[RLlib] Fix a bunch of issues related to connectors. (#26510 )	2022-07-13 18:55:20 +02:00
Rohan Potdar	09ce4711fd	[RLlib]: Move OPE to evaluation config (#25911 )	2022-07-12 11:04:34 -07:00
Avnish Narayan	1243ed62bf	[RLlib] Make Dataset reader default reader and enable CRR to use dataset (#26304 ) Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>	2022-07-08 12:43:35 -07:00
Sven Mika	ca913ff6d6	[RLlib] Eval WorkerSet crashes when trying to re-add a failed worker (eval set does not have local worker). (#26134 )	2022-06-30 13:25:22 +02:00
Jun Gong	d83bbda281	[RLlib] Save serialized PolicySpec. Extract `num_gpus` related logics into a util function. (#25954 )	2022-06-30 11:38:21 +02:00
Jun Gong	52bb8e47d4	[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922 )	2022-06-30 08:44:10 +02:00
Artur Niederfahrenhorst	ecd6047e39	Revert "[RLlib] Small Ape-X deflake. (#26078 )" (#26191 ) This reverts commit `11a549d4bd`.	2022-06-29 10:25:47 -07:00
Artur Niederfahrenhorst	11a549d4bd	[RLlib] Small Ape-X deflake. (#26078 )	2022-06-29 14:06:42 +02:00
Sven Mika	2b43713785	[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851 )	2022-06-29 08:41:47 +02:00
Charles Sun	70f94e6d63	[RLlib] Migrating DDPG to PolicyV2. (#26054 )	2022-06-28 15:52:56 +02:00
kourosh hakhamaneshi	f421730b47	[RLlib] Added `expectation` advantage_type option to CRR. (#26142 )	2022-06-28 15:40:09 +02:00
Sven Mika	762cfbdff1	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117 )	2022-06-28 15:14:37 +02:00


119 commits