hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Artur Niederfahrenhorst	11a549d4bd	[RLlib] Small Ape-X deflake. (#26078 )	2022-06-29 14:06:42 +02:00
Sven Mika	2b43713785	[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851 )	2022-06-29 08:41:47 +02:00
simonsays1980	05d3af766c	[RLlib] Added 'episode.hist_data' to the 'atari_metrics' to ensure that custom metrics of the user are kept in postprocessing when using Atari environments. (#25292 )	2022-06-28 16:31:57 +02:00
Charles Sun	70f94e6d63	[RLlib] Migrating DDPG to PolicyV2. (#26054 )	2022-06-28 15:52:56 +02:00
kourosh hakhamaneshi	f421730b47	[RLlib] Added `expectation` advantage_type option to CRR. (#26142 )	2022-06-28 15:40:09 +02:00
Sven Mika	762cfbdff1	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117 )	2022-06-28 15:14:37 +02:00
Artur Niederfahrenhorst	efea87f0cb	[RLlib] SimpleQ PyTorch Multi GPU fix (#26109 )	2022-06-28 12:12:56 +02:00
Artur Niederfahrenhorst	64a0eae758	simplexfix (#26122 )	2022-06-27 08:25:19 -07:00
Artur Niederfahrenhorst	bed9083f35	[RLlib] Add timeout to filter synchronization. (#25959 )	2022-06-24 14:37:43 +02:00
Jun Gong	257e67474c	[RLlib] introduce serialization for our custom gym space types. (#25923 )	2022-06-23 22:55:57 -07:00
Jun Gong	8c9cac350d	Fix unit test test_check_env.py and test_check_multi_agent.py. (#25993 )	2022-06-23 22:55:41 -07:00
Artur Niederfahrenhorst	a3f1323457	[RLlib] Make QMix use the ReplayBufferAPI (#25560 )	2022-06-23 22:55:22 -07:00
Sven Mika	59a967a3a0	[RLlib] Cleanup some deprecated metric keys and classes. (#26036 )	2022-06-23 21:30:01 +02:00
JYX	bde46e8a88	Fix several typos in rollout_worker.py (#26028 )	2022-06-23 11:41:53 -07:00
Sven Mika	be1042429d	[RLlib] Deprecation: Replace remaining `evaluation_num_episodes` with `evaluation_duration`. (#26000 )	2022-06-23 19:11:29 +02:00
Kai Fricke	8a2f6bda62	[tune/structure] Introduce experiment package (#26033 ) Experiment, Trial, and config parsing moves into an `experiment` package. Notably, the new public facing APIs will be `from ray.tune.experiment import Experiment` and `from ray.tune.experiment import Trial`.	2022-06-23 14:52:46 +01:00
Kai Fricke	0959f44b6f	[tune/structure] Introduce execution package (#26015 ) Execution-specific packages are moved to tune.execution. Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>	2022-06-23 11:13:19 +01:00
Sven Mika	3d6df50258	[RLlib] Fix `get_num_samples_loaded_into_buffer` in TorchPolicyV2. (#25956 )	2022-06-22 13:11:41 +02:00
Avnish Narayan	871aef80dc	[RLlib] Aggregate Impala learner info. (#25856 )	2022-06-22 09:43:10 +02:00
Eric Liang	43aa2299e6	[api] Annotate as public / move ray-core APIs to _private and add enforcement rule (#25695 ) Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.	2022-06-21 15:13:29 -07:00
Artur Niederfahrenhorst	dcbc225728	[RLlib] Fix DDPG test ignoring `framework_iterator`-modified config. (#25913 )	2022-06-21 16:17:42 +02:00
Avnish Narayan	d859b84058	[RLlib] Add compute log likelihoods test for CRR. (#25905 )	2022-06-21 16:06:10 +02:00
Rohan Potdar	28df3f34f5	[RLlib]: Off-Policy Evaluation fixes. (#25899 )	2022-06-21 13:24:24 +02:00
Artur Niederfahrenhorst	e10876604d	[RLlib] Include SampleBatch.T column in all collected batches. (#25926 )	2022-06-21 13:20:22 +02:00
Sven Mika	1499af945b	[RLlib] Algorithm `step()` fixes: evaluation should NOT be part of timed `training_step` loop. (#25924 )	2022-06-20 19:53:47 +02:00
Sven Mika	96693055bd	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869 )	2022-06-20 15:54:00 +02:00
Sven Mika	d90c6cfbd6	[RLlib] SimpleQ PolicyV2 (sub-classing). (#25871 )	2022-06-17 20:12:16 +02:00
Fabian Witter	fcdf710574	[RLlib] Move offline input into replay buffer using rollout ops in CQL. (#25629 )	2022-06-17 17:08:55 +02:00
Artur Niederfahrenhorst	a322cc5765	[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848 )	2022-06-17 14:10:36 +02:00
Artur Niederfahrenhorst	e5740946b8	[RLlib] Fixes logging of all of RLlib's Algorithm names as warning messages. (#25840 )	2022-06-17 08:41:18 +02:00
Avnish Narayan	393cf4d8f7	[RLlib] Fix `action_sampler_fn` call in `TorchPolicyV2` (`obs_batch` instead of `input_dict` arg). (#25877 )	2022-06-17 08:39:39 +02:00
Artur Niederfahrenhorst	f34cd2fd8f	[RLlib] Take replay buffer api example out of GPU examples. (#25841 )	2022-06-16 19:12:38 +02:00
Yi Cheng	7b8b0f8e03	Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624 )" (#25776 ) This reverts commit `804719876b`.	2022-06-14 13:59:15 -07:00
Jun Gong	c026374acb	[RLlib] Fix the 2 failing RLlib release tests. (#25603 )	2022-06-14 14:51:08 +02:00
Kai Fricke	6313ddc47c	[tune] Refactor Syncer / deprecate Sync client (#25655 ) This PR includes / depends on #25709. The two concepts of Syncer and SyncClient are confusing, as is the current API for passing custom sync functions. This PR refactors Tune's syncing behavior. The Sync client concept is hard deprecated. Instead, we offer a well-defined Syncer API that can be extended to provide custom syncing functionality. However, the default will be to use Ray AIR's file transfer utilities. New API: - Users can pass `syncer=CustomSyncer` which implements the `Syncer` API - Otherwise our off-the-shelf syncing is used - As before, syncing to cloud disables syncing to driver Changes: - Sync client is removed - Syncer interface introduced - _DefaultSyncer is a wrapper around the URI upload/download API from Ray AIR - SyncerCallback only uses remote tasks to synchronize data - Rsync syncing is fully deprecated and removed - Docker and kubernetes-specific syncing is fully deprecated and removed - Testing is improved to use `file://` URIs instead of mock sync clients	2022-06-14 14:46:30 +02:00
kourosh hakhamaneshi	f597e21ac8	[RLlib] Fix sample batch concat samples. (#25572 )	2022-06-14 12:47:29 +02:00
kourosh hakhamaneshi	25940cb95b	[RLlib] CRR documentation. (#25667 )	2022-06-14 12:45:36 +02:00
Avnish Narayan	804719876b	[RLlib] Remove execution plan code no longer used by RLlib. (#25624 )	2022-06-14 10:57:27 +02:00
Kai Fricke	736c7b13c4	[CI] Fix team to `rllib` (from `ml`) for some replay buffer API tests. (#25702 )	2022-06-11 18:05:16 +02:00
Sven Mika	130b7eeaba	[RLlib] `Trainer` to `Algorithm` renaming. (#25539 )	2022-06-11 15:10:39 +02:00
Sven Mika	7c39aa5fac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076 )	2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst	94d6c212df	[RLlib] Replay Buffer API documentation. (#24683 )	2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst	c3645928ca	[RLlib] Fix no gradient clipping happening in QMix. (#25656 )	2022-06-10 13:51:26 +02:00
Avnish Narayan	730df43656	[RLlib] Issue 25503: Replace torch.range with torch.arange. (#25640 )	2022-06-10 13:21:54 +02:00
kourosh hakhamaneshi	b3a351925d	[RLlib] Added meaningful error for multi-agent failure of SampleCollector in case no agent steps in episode. (#25596 )	2022-06-10 12:30:43 +02:00
Artur Niederfahrenhorst	8af9ef8fee	[RLlib] Discussion 6432: Automatic `train_batch_size` calculation fix. (#25621 )	2022-06-10 12:15:57 +02:00
Artur Niederfahrenhorst	7495e9c89c	[RLlib] Dreamer Policy sub-classing schema. (#25585 )	2022-06-09 17:14:15 +02:00
Kai Fricke	aa142eb377	[RLlib; CI] Add `team:rllib` tag for Bazel. (#25589 ) Currently, team:ml spans all ML (Tune, Train, AIR) tests and rllib tests. rllib tests are much more flaky and it would be good to split them up in the flaky test tracker. This PR changes Rllib-tests from team:ml to team:rllib to enable this separation. Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2022-06-08 22:25:59 +01:00
Artur Niederfahrenhorst	9226643433	[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. (#25584 )	2022-06-08 19:40:57 +02:00
Sven Mika	388fb98c79	[RLlib] CRR Tests fixes. (#25586 )	2022-06-08 19:18:55 +02:00

1299 commits