Artur Niederfahrenhorst
|
8af9ef8fee
|
[RLlib] Discussion 6432: Automatic train_batch_size calculation fix. (#25621)
|
2022-06-10 12:15:57 +02:00 |
|
Artur Niederfahrenhorst
|
7495e9c89c
|
[RLlib] Dreamer Policy sub-classing schema. (#25585)
|
2022-06-09 17:14:15 +02:00 |
|
Artur Niederfahrenhorst
|
9226643433
|
[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. (#25584)
|
2022-06-08 19:40:57 +02:00 |
|
Sven Mika
|
388fb98c79
|
[RLlib] CRR Tests fixes. (#25586)
|
2022-06-08 19:18:55 +02:00 |
|
kourosh hakhamaneshi
|
4cdd508f70
|
[RLlib] Added CRR implementation. (#25499)
|
2022-06-08 11:42:02 +02:00 |
|
Rohan Potdar
|
a9d8da0100
|
[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056)
|
2022-06-07 12:52:19 +02:00 |
|
Artur Niederfahrenhorst
|
35bd397181
|
[RLlib] Better default values for training_intensity and target_network_update_freq for R2D2. (#25510)
|
2022-06-07 10:29:56 +02:00 |
|
Vince Jankovics
|
68444cd390
|
[tune] Custom resources per worker added to default_resource_request (#24463)
This resolves the `TODO(ekl): add custom resources here once tune supports them` item.
Also, related to the discussion [here](https://discuss.ray.io/t/reserve-workers-on-gpu-node-for-trainer-workers-only/5972/5).
Co-authored-by: Kai Fricke <kai@anyscale.com>
|
2022-06-06 22:41:02 +01:00 |
|
Artur Niederfahrenhorst
|
5133978adc
|
[RLlib] PG policy subclassing conversion. (#25288)
|
2022-06-06 13:07:47 +02:00 |
|
Sven Mika
|
b5bc2b93c3
|
[RLlib] Move all remaining algos into algorithms directory. (#25366)
|
2022-06-04 07:35:24 +02:00 |
|
Jun Gong
|
1d24d6af98
|
[RLlib] Fix MARWIL tf policy. (#25384)
|
2022-06-03 10:50:36 +02:00 |
|
Yi Cheng
|
fd0f967d2e
|
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)" (#25420)
This reverts commit e4ceae19ef.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never failed before this PR.
It also fails in the CI of the reverted PR (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128), so the failure is highly likely caused by this PR.
The test failure output seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b).
|
2022-06-02 20:38:44 -07:00 |
|
Sven Mika
|
e4ceae19ef
|
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)
|
2022-06-02 16:47:05 +02:00 |
|
Steven Morad
|
f781622f86
|
[RLlib] Bandits (torch) Policy sub-class. (#25254)
Co-authored-by: Steven Morad <smorad@anyscale.com>
|
2022-06-02 15:16:51 +02:00 |
|
Eric Liang
|
905258dbc1
|
Clean up docstyle in python modules and add LINT rule (#25272)
|
2022-06-01 11:27:54 -07:00 |
|
Sven Mika
|
18c03f8d93
|
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). (#25314)
|
2022-06-01 09:29:16 +02:00 |
|
Sven Mika
|
94557e3095
|
[RLlib] Apex-DDPG TrainerConfig objects. (#25279)
|
2022-05-30 19:45:38 +02:00 |
|
Sven Mika
|
c5edd82c63
|
[RLlib] MB-MPO TrainerConfig objects. (#25278)
|
2022-05-30 17:33:01 +02:00 |
|
Sven Mika
|
f75ede1b81
|
[RLlib] MA-DDPG TrainerConfig objects. (#25255)
|
2022-05-30 15:38:24 +02:00 |
|
Sven Mika
|
30f6fc340b
|
[RLlib] AlphaZero TrainerConfig objects. (#25256)
|
2022-05-30 15:37:58 +02:00 |
|
Rohan Potdar
|
ab81c8e9ca
|
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. (#25107)
|
2022-05-27 13:14:54 +02:00 |
|
Avnish Narayan
|
eaed256d68
|
[RLlib] Async parallel execution manager. (#24423)
|
2022-05-25 17:54:08 +02:00 |
|
Jun Gong
|
eaf9c941ae
|
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. (#25117)
|
2022-05-25 14:38:03 +02:00 |
|
Artur Niederfahrenhorst
|
d76ef9add5
|
[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. (#24923)
|
2022-05-24 14:39:43 +02:00 |
|
Sven Mika
|
e73c37cc17
|
[RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. (#24579)
|
2022-05-24 12:53:53 +02:00 |
|
Sven Mika
|
ec89fe5203
|
[RLlib] APEX-DQN and R2D2 config objects. (#25067)
|
2022-05-23 12:15:45 +02:00 |
|
Sven Mika
|
dea9b86a16
|
[RLlib] MAML config objects. (#25066)
|
2022-05-23 10:14:24 +02:00 |
|
Sven Mika
|
baf8c2fa1e
|
[RLlib] TD3 config objects. (#25065)
|
2022-05-23 10:07:13 +02:00 |
|
Sven Mika
|
09886d7ab8
|
[RLlib] Upgrade gym 0.23 (#24171)
|
2022-05-23 08:18:44 +02:00 |
|
Steven Morad
|
501d932449
|
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059)
|
2022-05-22 19:58:47 +02:00 |
|
Rohan Potdar
|
5a70b732e8
|
[RLlib] MARWIL and BC Config. (#24853)
|
2022-05-21 12:50:20 +02:00 |
|
Jun Gong
|
d5a6d46049
|
[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. (#24914)
|
2022-05-20 14:10:59 +02:00 |
|
Kai Fricke
|
3e053c85ee
|
[RLlib] Fix broken links from agent -> algo conversion. (#25014)
|
2022-05-20 11:37:11 +02:00 |
|
kourosh hakhamaneshi
|
3815e52a61
|
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)
|
2022-05-19 18:30:42 +02:00 |
|
Sven Mika
|
8f50087908
|
[RLlib] AlphaZero uses training_iteration API. (#24507)
|
2022-05-18 09:58:25 +02:00 |
|
Jun Gong
|
dea134a472
|
[RLlib] Clean up Policy mixins. (#24746)
|
2022-05-17 17:16:08 +02:00 |
|
Artur Niederfahrenhorst
|
fb2915d26a
|
[RLlib] Replay Buffer API and Ape-X. (#24506)
|
2022-05-17 13:43:49 +02:00 |
|
Sven Mika
|
25001f6d8d
|
[RLlib] APPO Training iteration fn. (#24545)
|
2022-05-17 10:31:07 +02:00 |
|
Sven Mika
|
0cd7bc4054
|
[RLlib] Re-establish dashboard performance tests. (#24728)
|
2022-05-16 13:13:49 +02:00 |
|
Kai Fricke
|
96da5dc776
|
[rllib] Fix some missing agent->algorithm doc changes (#24841)
#24797 missed some doc changes that popped up in broken linkcheck. Note that there could be others that were not caught by this.
|
2022-05-16 11:52:49 +01:00 |
|
Jun Gong
|
68a9a33386
|
[RLlib] Retry agents -> algorithms, with proper doc changes this time. (#24797)
|
2022-05-16 09:45:32 +02:00 |
|
Simon Mo
|
9f23affdc0
|
[Hotfix] Unbreak lint in master (#24794)
|
2022-05-13 15:05:05 -07:00 |
|
kourosh hakhamaneshi
|
ffcbb30552
|
[RLlib] Move from agents to algorithms - CQL, MARWIL, AlphaStar, MAML, Dreamer, MBMPO. (#24739)
|
2022-05-13 18:43:36 +02:00 |
|
kourosh hakhamaneshi
|
69055f556d
|
[RLlib] Move agents.ars to algorithms.ars. (#24516)
|
2022-05-06 19:11:15 +02:00 |
|
kourosh hakhamaneshi
|
f48f1b252c
|
[RLlib] Moved agents.es to algorithms.es (#24511)
|
2022-05-06 14:54:22 +02:00 |
|