hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 18:11:42 -05:00

Author	SHA1	Message	Date
Artur Niederfahrenhorst	56e7800e0b	[RLlib] Tolerate nan metrics in LearnerInfoBuilder. (#27981 )	2022-08-23 10:07:32 -07:00
Artur Niederfahrenhorst	7ddd14b5db	[RLlib] Fix PPOTorchPolicy producing float metrics when not using critic. (#27980 )	2022-08-22 09:41:36 -07:00
Jun Gong	62b91cbec0	[docs][rllib] Documentation for connectors. (#27528 ) Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2022-08-19 14:35:07 -07:00
kourosh hakhamaneshi	5520a96ce0	[RLlib] Fix `get_init_state` annotation in torch and define more specific `TensorType`. (#27791 )	2022-08-11 20:02:17 +02:00
Artur Niederfahrenhorst	0dceddb912	[RLlib] Move learning_starts logic from buffers into `training_step()`. (#26032 )	2022-08-11 13:07:30 +02:00
Artur Niederfahrenhorst	894e19f791	[RLlib] Dreamer's Episodic buffer should abide by ReplayBuffer API. (#27424 )	2022-08-11 09:13:55 +02:00
Artur Niederfahrenhorst	04bc845360	[RLlib] Fix priority update for sequenced batches. (#27544 )	2022-08-10 12:48:25 +02:00
kourosh hakhamaneshi	3b3c20209b	[RLlib] Fix dqn reproducibility (#27459 )	2022-08-09 15:56:44 -07:00
kourosh hakhamaneshi	3b2a8427af	[RLlib] Fix SampleBatch to_device(). (#27572 )	2022-08-08 18:18:33 +02:00
Jun Gong	5f07987ab1	[RLlib] Fix connector examples (#27583 )	2022-08-07 17:48:09 -07:00
Rohan Potdar	5b6a58ed28	[RLlib] Add OPE Learning Tests (#27154 )	2022-08-02 17:51:38 -07:00
Steven Morad	77318abfaf	[RLlib] Warn on PPO infinite KL loss term. (#26629 )	2022-08-01 12:55:26 +02:00
Eric Liang	a4434fac7f	[docs] Fix the remaining style violations in docstrings and add lint rule (#27033 )	2022-07-27 22:24:20 -07:00
Jun Gong	acf2bf9b2f	[RLlib] Get rid of all these deprecation warnings. (#27085 )	2022-07-27 10:48:54 -07:00
xwjiang2010	fcf897ee72	[air] update rllib example to use Tuner API. (#26987 ) update rllib example to use Tuner API. Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>	2022-07-27 12:12:59 +01:00
kourosh hakhamaneshi	5030a4c1d3	[RLlib] Simplify agent collector (#26803 )	2022-07-25 13:17:17 -07:00
Artur Niederfahrenhorst	e9a8f7d9ae	[RLlib] Unify gnorm mixin for tf and torch policies. (#26102 )	2022-07-24 15:31:09 +02:00
Ishant Mrinal	b32c784c7f	[RLLib] RE3 exploration algorithm TF2 framework support (#25221 )	2022-07-23 18:05:01 -07:00
Rohan Potdar	97bcf38ec0	[RLlib] Fix torch None conversion in `torch_utils.py::convert_to_torch_tensor`. (#26863 )	2022-07-23 13:54:57 +02:00
Steven Morad	259429bdc3	Bump gym dep to 0.24 (#26190 ) Co-authored-by: Steven Morad <smorad@anyscale.com> Co-authored-by: Avnish <avnishnarayan@gmail.com> Co-authored-by: Avnish Narayan <38871737+avnishn@users.noreply.github.com>	2022-07-22 12:37:16 -07:00
Olaf Lipinski	8271406a04	[RLLib] Fix MultiDiscrete not being one-hotted correctly (#26558 ) Co-authored-by: Jun Gong <jungong@anyscale.com>	2022-07-20 15:25:53 -07:00
Jun Gong	6b6d3017ba	[RLlib] more connector polishes and fixes. (#26645 )	2022-07-19 08:50:28 -07:00
Artur Niederfahrenhorst	0ce3bc5e48	[RLlib] Add/reorder Args of Prioritized/MixIn MultiAgentReplayBuffer. (#26428 )	2022-07-18 18:04:03 +02:00
Rohan Potdar	38c9e1d52a	[RLlib]: Fix OPE trainables (#26279 ) Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2022-07-17 14:25:53 -07:00
mgerstgrasser	f0e9d1a9bb	[RLlib] In env check, step only expected agents. (#26425 )	2022-07-15 09:16:09 +02:00
Sven Mika	4aea24c8a8	[RLlib] `restart_failed_sub_environments` now works for MA cases and crashes during `reset()`; +more tests and logging; add eval worker sub-env fault tolerance test. (#26276 )	2022-07-15 08:55:14 +02:00
Jun Gong	b383d987d1	[RLlib] Fix a bunch of issues related to connectors. (#26510 )	2022-07-13 18:55:20 +02:00
Avnish Narayan	5df66b917d	[Lint Check] Remove broken link (#26505 ) The paper is not available anymore.	2022-07-13 10:30:20 +01:00
Jun Gong	0c469e490e	[RLlib] Checkpoint and restore connectors. (#26253 )	2022-07-09 01:06:24 -07:00
Jun Gong	d234348bd2	[RLlib] Minor simplification of code. (#26312 )	2022-07-08 13:21:54 -07:00
Sven Mika	f8785c49df	[RLlib] Issue 25696: Output writers not working w/ multiple workers. (#25722 )	2022-06-30 13:25:56 +02:00
Jun Gong	d83bbda281	[RLlib] Save serialized PolicySpec. Extract `num_gpus` related logics into a util function. (#25954 )	2022-06-30 11:38:21 +02:00
Jun Gong	52bb8e47d4	[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922 )	2022-06-30 08:44:10 +02:00
Artur Niederfahrenhorst	64a0eae758	simplexfix (#26122 )	2022-06-27 08:25:19 -07:00
Artur Niederfahrenhorst	bed9083f35	[RLlib] Add timeout to filter synchronization. (#25959 )	2022-06-24 14:37:43 +02:00
Jun Gong	257e67474c	[RLlib] introduce serialization for our custom gym space types. (#25923 )	2022-06-23 22:55:57 -07:00
Jun Gong	8c9cac350d	Fix unit test test_check_env.py and est_check_multi_agent.py. (#25993 )	2022-06-23 22:55:41 -07:00
Artur Niederfahrenhorst	a3f1323457	[RLlib] Make QMix use the ReplayBufferAPI (#25560 )	2022-06-23 22:55:22 -07:00
Sven Mika	59a967a3a0	[RLlib] Cleanup some deprecated metric keys and classes. (#26036 )	2022-06-23 21:30:01 +02:00
Eric Liang	43aa2299e6	[api] Annotate as public / move ray-core APIs to _private and add enforcement rule (#25695 ) Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.	2022-06-21 15:13:29 -07:00
Sven Mika	96693055bd	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869 )	2022-06-20 15:54:00 +02:00
Sven Mika	d90c6cfbd6	[RLlib] SimpleQ PolicyV2 (sub-classing). (#25871 )	2022-06-17 20:12:16 +02:00
Artur Niederfahrenhorst	a322cc5765	[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848 )	2022-06-17 14:10:36 +02:00
Yi Cheng	7b8b0f8e03	Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624 )" (#25776 ) This reverts commit `804719876b`.	2022-06-14 13:59:15 -07:00
Jun Gong	c026374acb	[RLlib] Fix the 2 failing RLlib release tests. (#25603 )	2022-06-14 14:51:08 +02:00
Avnish Narayan	804719876b	[RLlib] Remove execution plan code no longer used by RLlib. (#25624 )	2022-06-14 10:57:27 +02:00
Sven Mika	130b7eeaba	[RLlib] `Trainer` to `Algorithm` renaming. (#25539 )	2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst	94d6c212df	[RLlib] Replay Buffer API documentation. (#24683 )	2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst	9226643433	[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. (#25584 )	2022-06-08 19:40:57 +02:00
Jun Gong	9b65d5535d	[RLlib] Introduce basic connectors library. (#25311 )	2022-06-07 19:18:14 +02:00

411 commits