Artur Niederfahrenhorst
efea87f0cb
[RLlib] SimpleQ PyTorch Multi GPU fix ( #26109 )
2022-06-28 12:12:56 +02:00
Artur Niederfahrenhorst
bed9083f35
[RLlib] Add timeout to filter synchronization. ( #25959 )
2022-06-24 14:37:43 +02:00
Artur Niederfahrenhorst
a3f1323457
[RLlib] Make QMix use the ReplayBufferAPI ( #25560 )
2022-06-23 22:55:22 -07:00
Sven Mika
59a967a3a0
[RLlib] Cleanup some deprecated metric keys and classes. ( #26036 )
2022-06-23 21:30:01 +02:00
Kai Fricke
8a2f6bda62
[tune/structure] Introduce experiment package ( #26033 )
...
Experiment, Trial, and config parsing moves into an `experiment` package.
Notably, the new public facing APIs will be
```
from ray.tune.experiment import Experiment
from ray.tune.experiment import Trial
```
2022-06-23 14:52:46 +01:00
Kai Fricke
0959f44b6f
[tune/structure] Introduce execution package ( #26015 )
...
Execution-specific packages are moved to tune.execution.
Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
2022-06-23 11:13:19 +01:00
Avnish Narayan
871aef80dc
[RLlib] Aggregate Impala learner info. ( #25856 )
2022-06-22 09:43:10 +02:00
Artur Niederfahrenhorst
dcbc225728
[RLlib] Fix DDPG test ignoring framework_iterator
-modified config. ( #25913 )
2022-06-21 16:17:42 +02:00
Avnish Narayan
d859b84058
[RLlib] Add compute log likelihoods test for CRR. ( #25905 )
2022-06-21 16:06:10 +02:00
Sven Mika
1499af945b
[RLlib] Algorithm step()
fixes: evaluation should NOT be part of timed training_step
loop. ( #25924 )
2022-06-20 19:53:47 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). ( #25871 )
2022-06-17 20:12:16 +02:00
Fabian Witter
fcdf710574
[RLlib] Move offline input into replay buffer using rollout ops in CQL. ( #25629 )
2022-06-17 17:08:55 +02:00
Artur Niederfahrenhorst
a322cc5765
[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). ( #25848 )
2022-06-17 14:10:36 +02:00
Artur Niederfahrenhorst
e5740946b8
[RLlib] Fixes logging of all of RLlib's Algorithm names as warning messages. ( #25840 )
2022-06-17 08:41:18 +02:00
Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. ( #25624 )" ( #25776 )
...
This reverts commit 804719876b
.
2022-06-14 13:59:15 -07:00
Kai Fricke
6313ddc47c
[tune] Refactor Syncer / deprecate Sync client ( #25655 )
...
This PR includes / depends on #25709
The two concepts of Syncer and SyncClient are confusing, as is the current API for passing custom sync functions.
This PR refactors Tune's syncing behavior. The Sync client concept is hard deprecated. Instead, we offer a well defined Syncer API that can be extended to provide own syncing functionality. However, the default will be to use Ray AIRs file transfer utilities.
New API:
- Users can pass `syncer=CustomSyncer` which implements the `Syncer` API
- Otherwise our off-the-shelf syncing is used
- As before, syncing to cloud disables syncing to driver
Changes:
- Sync client is removed
- Syncer interface introduced
- _DefaultSyncer is a wrapper around the URI upload/download API from Ray AIR
- SyncerCallback only uses remote tasks to synchronize data
- Rsync syncing is fully depracated and removed
- Docker and kubernetes-specific syncing is fully deprecated and removed
- Testing is improved to use `file://` URIs instead of mock sync clients
2022-06-14 14:46:30 +02:00
kourosh hakhamaneshi
25940cb95b
[RLlib] CRR documentation. ( #25667 )
2022-06-14 12:45:36 +02:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. ( #25624 )
2022-06-14 10:57:27 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer
to Algorithm
renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. ( #25076 )
2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. ( #24683 )
2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst
c3645928ca
[RLlib] Fix no gradient clipping happening in QMix. ( #25656 )
2022-06-10 13:51:26 +02:00
Avnish Narayan
730df43656
[RLlib] Issue 25503: Replace torch.range with torch.arange. ( #25640 )
2022-06-10 13:21:54 +02:00
Artur Niederfahrenhorst
8af9ef8fee
[RLlib] Discussion 6432: Automatic train_batch_size
calculation fix. ( #25621 )
2022-06-10 12:15:57 +02:00
Artur Niederfahrenhorst
7495e9c89c
[RLlib] Dreamer Policy sub-classing schema. ( #25585 )
2022-06-09 17:14:15 +02:00
Artur Niederfahrenhorst
9226643433
[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. ( #25584 )
2022-06-08 19:40:57 +02:00
Sven Mika
388fb98c79
[RLlib] CRR Tests fixes. ( #25586 )
2022-06-08 19:18:55 +02:00
kourosh hakhamaneshi
4cdd508f70
[RLlib] Added CRR implementation. ( #25499 )
2022-06-08 11:42:02 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. ( #25056 )
2022-06-07 12:52:19 +02:00
Artur Niederfahrenhorst
35bd397181
[RLlib] Better default values for training_intensity
and target_network_update_freq
for R2D2. ( #25510 )
2022-06-07 10:29:56 +02:00
Vince Jankovics
68444cd390
[tune] Custom resources per worker added to default_resource_request ( #24463 )
...
This resolves the `TODO(ekl): add custom resources here once tune supports them` item.
Also, related to the discussion [here](https://discuss.ray.io/t/reserve-workers-on-gpu-node-for-trainer-workers-only/5972/5 ).
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-06-06 22:41:02 +01:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. ( #25288 )
2022-06-06 13:07:47 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms
directory. ( #25366 )
2022-06-04 07:35:24 +02:00
Jun Gong
1d24d6af98
[RLlib] Fix MARWIL tf policy. ( #25384 )
2022-06-03 10:50:36 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms
dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
...
This reverts commit e4ceae19ef
.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never fail before this PR.
In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128 ). So high likely it's because of this PR.
And test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b )
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms
dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Steven Morad
f781622f86
[RLlib] Bandits (torch) Policy sub-class. ( #25254 )
...
Co-authored-by: Steven Morad <smorad@anyscale.com>
2022-06-02 15:16:51 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms
folder and re-name into A2C/A3C (from ...Trainer). ( #25314 )
2022-06-01 09:29:16 +02:00
Sven Mika
94557e3095
[RLlib] Apex-DDPG TrainerConfig objects. ( #25279 )
2022-05-30 19:45:38 +02:00
Sven Mika
c5edd82c63
[RLlib] MB-MPO TrainerConfig objects. ( #25278 )
2022-05-30 17:33:01 +02:00
Sven Mika
f75ede1b81
[RLlib] MA-DDPG TrainerConfig objects. ( #25255 )
2022-05-30 15:38:24 +02:00
Sven Mika
30f6fc340b
[RLlib] AlphaZero TrainerConfig objects. ( #25256 )
2022-05-30 15:37:58 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation
to off_policy_estimation_methods
. ( #25107 )
2022-05-27 13:14:54 +02:00
Avnish Narayan
eaed256d68
[RLlib] Async parallel execution manager. ( #24423 )
2022-05-25 17:54:08 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. ( #25117 )
2022-05-25 14:38:03 +02:00
Artur Niederfahrenhorst
d76ef9add5
[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. ( #24923 )
2022-05-24 14:39:43 +02:00
Sven Mika
e73c37cc17
[RLlib] MADDPG: Move into main algorithms
folder and add proper unit and learning tests. ( #24579 )
2022-05-24 12:53:53 +02:00
Sven Mika
ec89fe5203
[RLlib] APEX-DQN and R2D2 config objects. ( #25067 )
2022-05-23 12:15:45 +02:00