1387 commits

Author SHA1 Message Date
Sven Mika
59a967a3a0
[RLlib] Cleanup some deprecated metric keys and classes. (#26036) 2022-06-23 21:30:01 +02:00
JYX
bde46e8a88
Fix several typos in rollout_worker.py (#26028) 2022-06-23 11:41:53 -07:00
Sven Mika
be1042429d
[RLlib] Deprecation: Replace remaining evaluation_num_episodes with evaluation_duration. (#26000) 2022-06-23 19:11:29 +02:00
Kai Fricke
8a2f6bda62
[tune/structure] Introduce experiment package (#26033)
Experiment, Trial, and config parsing moves into an `experiment` package.

Notably, the new public-facing APIs will be:

```
from ray.tune.experiment import Experiment
from ray.tune.experiment import Trial
```
2022-06-23 14:52:46 +01:00
Kai Fricke
0959f44b6f
[tune/structure] Introduce execution package (#26015)
Execution-specific packages are moved to tune.execution.

Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
2022-06-23 11:13:19 +01:00
Sven Mika
3d6df50258
[RLlib] Fix get_num_samples_loaded_into_buffer in TorchPolicyV2. (#25956) 2022-06-22 13:11:41 +02:00
Avnish Narayan
871aef80dc
[RLlib] Aggregate Impala learner info. (#25856) 2022-06-22 09:43:10 +02:00
Eric Liang
43aa2299e6
[api] Annotate as public / move ray-core APIs to _private and add enforcement rule (#25695)
Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.
2022-06-21 15:13:29 -07:00
Artur Niederfahrenhorst
dcbc225728
[RLlib] Fix DDPG test ignoring framework_iterator-modified config. (#25913) 2022-06-21 16:17:42 +02:00
Avnish Narayan
d859b84058
[RLlib] Add compute log likelihoods test for CRR. (#25905) 2022-06-21 16:06:10 +02:00
Rohan Potdar
28df3f34f5
[RLlib]: Off-Policy Evaluation fixes. (#25899) 2022-06-21 13:24:24 +02:00
Artur Niederfahrenhorst
e10876604d
[RLlib] Include SampleBatch.T column in all collected batches. (#25926) 2022-06-21 13:20:22 +02:00
Sven Mika
1499af945b
[RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924) 2022-06-20 19:53:47 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869) 2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). (#25871) 2022-06-17 20:12:16 +02:00
Fabian Witter
fcdf710574
[RLlib] Move offline input into replay buffer using rollout ops in CQL. (#25629) 2022-06-17 17:08:55 +02:00
Artur Niederfahrenhorst
a322cc5765
[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) 2022-06-17 14:10:36 +02:00
Artur Niederfahrenhorst
e5740946b8
[RLlib] Fixes logging of all of RLlib's Algorithm names as warning messages. (#25840) 2022-06-17 08:41:18 +02:00
Avnish Narayan
393cf4d8f7
[RLlib] Fix action_sampler_fn call in TorchPolicyV2 (obs_batch instead of input_dict arg). (#25877) 2022-06-17 08:39:39 +02:00
Artur Niederfahrenhorst
f34cd2fd8f
[RLlib] Take replay buffer api example out of GPU examples. (#25841) 2022-06-16 19:12:38 +02:00
Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)
This reverts commit 804719876b.
2022-06-14 13:59:15 -07:00
Jun Gong
c026374acb
[RLlib] Fix the 2 failing RLlib release tests. (#25603) 2022-06-14 14:51:08 +02:00
Kai Fricke
6313ddc47c
[tune] Refactor Syncer / deprecate Sync client (#25655)
This PR includes / depends on #25709

The two concepts of Syncer and SyncClient are confusing, as is the current API for passing custom sync functions.

This PR refactors Tune's syncing behavior. The Sync client concept is hard-deprecated. Instead, we offer a well-defined Syncer API that can be extended to provide custom syncing functionality. However, the default will be to use Ray AIR's file transfer utilities.

New API:
- Users can pass `syncer=CustomSyncer` which implements the `Syncer` API
- Otherwise our off-the-shelf syncing is used
- As before, syncing to cloud disables syncing to driver

Changes:
- Sync client is removed
- Syncer interface introduced
- _DefaultSyncer is a wrapper around the URI upload/download API from Ray AIR
- SyncerCallback only uses remote tasks to synchronize data
- Rsync syncing is fully deprecated and removed
- Docker and kubernetes-specific syncing is fully deprecated and removed
- Testing is improved to use `file://` URIs instead of mock sync clients
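
The extension point described above can be sketched as follows. This is a minimal illustration only: the `Syncer` base class here is a hypothetical stand-in defined inside the example, not Ray Tune's actual class, and the method names `sync_up`, `sync_down`, and `delete` as well as the `file://` handling are assumptions chosen for the sketch.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class Syncer(ABC):
    """Hypothetical stand-in for a Syncer interface (NOT Ray Tune's real class)."""

    @abstractmethod
    def sync_up(self, local_dir: str, remote_dir: str) -> bool:
        """Copy local_dir to remote_dir; return True on success."""

    @abstractmethod
    def sync_down(self, remote_dir: str, local_dir: str) -> bool:
        """Copy remote_dir to local_dir; return True on success."""

    @abstractmethod
    def delete(self, remote_dir: str) -> bool:
        """Remove remote_dir; return True on success."""


class CustomSyncer(Syncer):
    """Toy syncer that treats a file:// URI as the 'remote' location."""

    @staticmethod
    def _to_path(uri: str) -> Path:
        # Strip the file:// scheme (assumed URI form) to get a local path.
        prefix = "file://"
        return Path(uri[len(prefix):] if uri.startswith(prefix) else uri)

    def sync_up(self, local_dir: str, remote_dir: str) -> bool:
        shutil.copytree(local_dir, self._to_path(remote_dir), dirs_exist_ok=True)
        return True

    def sync_down(self, remote_dir: str, local_dir: str) -> bool:
        shutil.copytree(self._to_path(remote_dir), local_dir, dirs_exist_ok=True)
        return True

    def delete(self, remote_dir: str) -> bool:
        shutil.rmtree(self._to_path(remote_dir), ignore_errors=True)
        return True
```

Using `file://` targets in this way mirrors the testing approach listed above, since it exercises the syncing path without a mock client. The real interface may differ in method names and signatures.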
2022-06-14 14:46:30 +02:00
kourosh hakhamaneshi
f597e21ac8
[RLlib] Fix sample batch concat samples. (#25572) 2022-06-14 12:47:29 +02:00
kourosh hakhamaneshi
25940cb95b
[RLlib] CRR documentation. (#25667) 2022-06-14 12:45:36 +02:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. (#25624) 2022-06-14 10:57:27 +02:00
Kai Fricke
736c7b13c4
[CI] Fix team to rllib (from ml) for some replay buffer API tests. (#25702) 2022-06-11 18:05:16 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. (#25539) 2022-06-11 15:10:39 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. (#24683) 2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst
c3645928ca
[RLlib] Fix no gradient clipping happening in QMix. (#25656) 2022-06-10 13:51:26 +02:00
Avnish Narayan
730df43656
[RLlib] Issue 25503: Replace torch.range with torch.arange. (#25640) 2022-06-10 13:21:54 +02:00
kourosh hakhamaneshi
b3a351925d
[RLlib] Added meaningful error for multi-agent failure of SampleCollector in case no agent steps in episode. (#25596) 2022-06-10 12:30:43 +02:00
Artur Niederfahrenhorst
8af9ef8fee
[RLlib] Discussion 6432: Automatic train_batch_size calculation fix. (#25621) 2022-06-10 12:15:57 +02:00
Artur Niederfahrenhorst
7495e9c89c
[RLlib] Dreamer Policy sub-classing schema. (#25585) 2022-06-09 17:14:15 +02:00
Kai Fricke
aa142eb377
[RLlib; CI] Add team:rllib tag for Bazel. (#25589)
Currently, team:ml spans all ML (Tune, Train, AIR) tests and RLlib tests. RLlib tests are much flakier, and it would be good to split them up in the flaky test tracker. This PR changes RLlib tests from team:ml to team:rllib to enable this separation.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-08 22:25:59 +01:00
Artur Niederfahrenhorst
9226643433
[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. (#25584) 2022-06-08 19:40:57 +02:00
Sven Mika
388fb98c79
[RLlib] CRR Tests fixes. (#25586) 2022-06-08 19:18:55 +02:00
Kai Fricke
8affbc7be6
[tune/train] Consolidate checkpoint manager 3: Ray Tune (#24430)
**Update**: This PR is now part 3 of a three PR group to consolidate the checkpoints.

1. Part 1 adds the common checkpoint management class #24771 
2. Part 2 adds the integration for Ray Train #24772
3. This PR builds on #24772 and includes all changes. It moves the Ray Tune integration to use the new common checkpoint manager class.

Old PR description:

This PR consolidates the Ray Train and Tune checkpoint managers. These concepts previously did something very similar but in different modules. To simplify maintenance in the future, we've consolidated the common core.

- This PR keeps full compatibility with the previous interfaces and implementations. This means that for now, Train and Tune will have separate CheckpointManagers that both extend the common core
- This PR prepares Tune to move to a CheckpointStrategy object
- In follow-up PRs, we can further unify interfacing with the common core, possibly removing any train- or tune-specific adjustments (e.g. moving to setup on init rather than at runtime for Ray Train)

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-08 12:05:34 +01:00
kourosh hakhamaneshi
4cdd508f70
[RLlib] Added CRR implementation. (#25499) 2022-06-08 11:42:02 +02:00
Jun Gong
9b65d5535d
[RLlib] Introduce basic connectors library. (#25311) 2022-06-07 19:18:14 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056) 2022-06-07 12:52:19 +02:00
Artur Niederfahrenhorst
429d0f0eee
[RLlib] Fix multi agent environment checks for observations that contain only some agents' obs each step. (#25506) 2022-06-07 10:33:35 +02:00
Artur Niederfahrenhorst
35bd397181
[RLlib] Better default values for training_intensity and target_network_update_freq for R2D2. (#25510) 2022-06-07 10:29:56 +02:00
Vince Jankovics
68444cd390
[tune] Custom resources per worker added to default_resource_request (#24463)
This resolves the `TODO(ekl): add custom resources here once tune supports them` item. 
Also, related to the discussion [here](https://discuss.ray.io/t/reserve-workers-on-gpu-node-for-trainer-workers-only/5972/5).

Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-06-06 22:41:02 +01:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. (#25288) 2022-06-06 13:07:47 +02:00
Artur Niederfahrenhorst
243038d00a
[RLlib] Issue 25401: Faulty usage of get_filter_config in ComplexInputNetworks (#25493) 2022-06-06 13:04:17 +02:00
kourosh hakhamaneshi
d49d0efbaf
[RLlib] Bug fix: when on GPU, sample_batch.to_device() only converts the device and does not convert float64 to float32. (#25460) 2022-06-06 12:43:11 +02:00
Artur Niederfahrenhorst
c4a0e9d0f2
[RLlib] Disambiguate timestep fragment storage unit in replay buffers. (#25242) 2022-06-06 11:35:49 +02:00
Jun Gong
644b80c0ef
[RLlib] mark learning and examples tests exclusive. (#25445) 2022-06-04 09:35:24 -07:00