Commit graph

1299 commits

Author SHA1 Message Date
Artur Niederfahrenhorst
11a549d4bd
[RLlib] Small Ape-X deflake. (#26078) 2022-06-29 14:06:42 +02:00
Sven Mika
2b43713785
[RLlib] Move IMPALA and APPO back to exec plan (for now; due to unresolved learning/performance issues). (#25851) 2022-06-29 08:41:47 +02:00
simonsays1980
05d3af766c
[RLlib] Added 'episode.hist_data' to the 'atari_metrics' to nsure that custom metrics of the user are kept in postprocessing when using Atari environments. (#25292) 2022-06-28 16:31:57 +02:00
Charles Sun
70f94e6d63
[RLlib] Migrating DDPG to PolicyV2. (#26054) 2022-06-28 15:52:56 +02:00
kourosh hakhamaneshi
f421730b47
[RLlib] Added expectation advantage_type option to CRR. (#26142) 2022-06-28 15:40:09 +02:00
Sven Mika
762cfbdff1
[RLlib] IMPALA and APPO metrics fixes; remove deprecated async_parallel_requests utility. (#26117) 2022-06-28 15:14:37 +02:00
Artur Niederfahrenhorst
efea87f0cb
[RLlib] SimpleQ PyTorch Multi GPU fix (#26109) 2022-06-28 12:12:56 +02:00
Artur Niederfahrenhorst
64a0eae758
simplexfix (#26122) 2022-06-27 08:25:19 -07:00
Artur Niederfahrenhorst
bed9083f35
[RLlib] Add timeout to filter synchronization. (#25959) 2022-06-24 14:37:43 +02:00
Jun Gong
257e67474c
[RLlib] introduce serialization for our custom gym space types. (#25923) 2022-06-23 22:55:57 -07:00
Jun Gong
8c9cac350d
Fix unit test test_check_env.py and est_check_multi_agent.py. (#25993) 2022-06-23 22:55:41 -07:00
Artur Niederfahrenhorst
a3f1323457
[RLlib] Make QMix use the ReplayBufferAPI (#25560) 2022-06-23 22:55:22 -07:00
Sven Mika
59a967a3a0
[RLlib] Cleanup some deprecated metric keys and classes. (#26036) 2022-06-23 21:30:01 +02:00
JYX
bde46e8a88
Fix several typos in rollout_worker.py (#26028) 2022-06-23 11:41:53 -07:00
Sven Mika
be1042429d
[RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
Kai Fricke
8a2f6bda62
[tune/structure] Introduce experiment package (#26033)
Experiment, Trial, and config parsing moves into an `experiment` package.

Notably, the new public facing APIs will be

```
from ray.tune.experiment import Experiment
from ray.tune.experiment import Trial
```
2022-06-23 14:52:46 +01:00
Kai Fricke
0959f44b6f
[tune/structure] Introduce execution package (#26015)
Execution-specific packages are moved to tune.execution.

Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
2022-06-23 11:13:19 +01:00
Sven Mika
3d6df50258
[RLlib] Fix get_num_samples_loaded_into_buffer in TorchPolicyV2. (#25956) 2022-06-22 13:11:41 +02:00
Avnish Narayan
871aef80dc
[RLlib] Aggregate Impala learner info. (#25856) 2022-06-22 09:43:10 +02:00
Eric Liang
43aa2299e6
[api] Annotate as public / move ray-core APIs to _private and add enforcement rule (#25695)
Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.
2022-06-21 15:13:29 -07:00
Artur Niederfahrenhorst
dcbc225728
[RLlib] Fix DDPG test ignoring framework_iterator-modified config. (#25913) 2022-06-21 16:17:42 +02:00
Avnish Narayan
d859b84058
[RLlib] Add compute log likelihoods test for CRR. (#25905) 2022-06-21 16:06:10 +02:00
Rohan Potdar
28df3f34f5
[RLlib]: Off-Policy Evaluation fixes. (#25899) 2022-06-21 13:24:24 +02:00
Artur Niederfahrenhorst
e10876604d
[RLlib] Include SampleBatch.T column in all collected batches. (#25926) 2022-06-21 13:20:22 +02:00
Sven Mika
1499af945b
[RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924) 2022-06-20 19:53:47 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869) 2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). (#25871) 2022-06-17 20:12:16 +02:00
Fabian Witter
fcdf710574
[RLlib] Move offline input into replay buffer using rollout ops in CQL. (#25629) 2022-06-17 17:08:55 +02:00
Artur Niederfahrenhorst
a322cc5765
[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) 2022-06-17 14:10:36 +02:00
Artur Niederfahrenhorst
e5740946b8
[RLlib] Fixes logging of all of RLlib's Algorithm names as warning messages. (#25840) 2022-06-17 08:41:18 +02:00
Avnish Narayan
393cf4d8f7
[RLlib] Fix action_sampler_fn call in TorchPolicyV2 (obs_batch instead of input_dict arg). (#25877) 2022-06-17 08:39:39 +02:00
Artur Niederfahrenhorst
f34cd2fd8f
[RLlib] Take replay buffer api example out of GPU examples. (#25841) 2022-06-16 19:12:38 +02:00
Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)
This reverts commit 804719876b.
2022-06-14 13:59:15 -07:00
Jun Gong
c026374acb
[RLlib] Fix the 2 failing RLlib release tests. (#25603) 2022-06-14 14:51:08 +02:00
Kai Fricke
6313ddc47c
[tune] Refactor Syncer / deprecate Sync client (#25655)
This PR includes / depends on #25709

The two concepts of Syncer and SyncClient are confusing, as is the current API for passing custom sync functions.

This PR refactors Tune's syncing behavior. The Sync client concept is hard deprecated. Instead, we offer a well defined Syncer API that can be extended to provide own syncing functionality. However, the default will be to use Ray AIRs file transfer utilities.

New API:
- Users can pass `syncer=CustomSyncer` which implements the `Syncer` API
- Otherwise our off-the-shelf syncing is used
- As before, syncing to cloud disables syncing to driver

Changes:
- Sync client is removed
- Syncer interface introduced
- _DefaultSyncer is a wrapper around the URI upload/download API from Ray AIR
- SyncerCallback only uses remote tasks to synchronize data
- Rsync syncing is fully depracated and removed
- Docker and kubernetes-specific syncing is fully deprecated and removed
- Testing is improved to use `file://` URIs instead of mock sync clients
2022-06-14 14:46:30 +02:00
kourosh hakhamaneshi
f597e21ac8
[RLlib] Fix sample batch concat samples. (#25572) 2022-06-14 12:47:29 +02:00
kourosh hakhamaneshi
25940cb95b
[RLlib] CRR documentation. (#25667) 2022-06-14 12:45:36 +02:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. (#25624) 2022-06-14 10:57:27 +02:00
Kai Fricke
736c7b13c4
[CI] Fix team to rllib (from ml) for some replay buffer API tests. (#25702) 2022-06-11 18:05:16 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. (#25539) 2022-06-11 15:10:39 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. (#24683) 2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst
c3645928ca
[RLlib] Fix no gradient clipping happening in QMix. (#25656) 2022-06-10 13:51:26 +02:00
Avnish Narayan
730df43656
[RLlib] Issue 25503: Replace torch.range with torch.arange. (#25640) 2022-06-10 13:21:54 +02:00
kourosh hakhamaneshi
b3a351925d
[RLlib] Added meaningful error for multi-agent failure of SampleCollector in case no agent steps in episode. (#25596) 2022-06-10 12:30:43 +02:00
Artur Niederfahrenhorst
8af9ef8fee
[RLlib] Discussion 6432: Automatic train_batch_size calculation fix. (#25621) 2022-06-10 12:15:57 +02:00
Artur Niederfahrenhorst
7495e9c89c
[RLlib] Dreamer Policy sub-classing schema. (#25585) 2022-06-09 17:14:15 +02:00
Kai Fricke
aa142eb377
[RLlib; CI] Add team:rllib tag for Bazel. (#25589)
Currently, team:ml spans all ML (Tune, Train, AIR) tests and rllib tests. rllib tests are much more flaky and it would be good to split them up in the flaky test tracker. This PR changes Rllib-tests from team:ml to team:rllib to enable this separation.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-08 22:25:59 +01:00
Artur Niederfahrenhorst
9226643433
[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. (#25584) 2022-06-08 19:40:57 +02:00
Sven Mika
388fb98c79
[RLlib] CRR Tests fixes. (#25586) 2022-06-08 19:18:55 +02:00