mirror of https://github.com/vale981/ray synced 2025-03-13 22:56:38 -04:00
Commit graph

677 commits

Author SHA1 Message Date
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. () 2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). () 2022-06-17 20:12:16 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. () 2022-06-11 15:10:39 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. () 2022-06-10 17:09:18 +02:00
kourosh hakhamaneshi
4cdd508f70
[RLlib] Added CRR implementation. () 2022-06-08 11:42:02 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. () 2022-06-07 12:52:19 +02:00
Vince Jankovics
68444cd390
[tune] Custom resources per worker added to default_resource_request ()
This resolves the `TODO(ekl): add custom resources here once tune supports them` item. 
Also, related to the discussion [here](https://discuss.ray.io/t/reserve-workers-on-gpu-node-for-trainer-workers-only/5972/5).

Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-06-06 22:41:02 +01:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. () 2022-06-06 13:07:47 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. () 2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ()" ()
This reverts commit e4ceae19ef.

Reverts 

linux://python/ray/tests:test_client_library_integration never failed before this PR.

It also fails in the CI of the reverted PR (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128), so it is highly likely that this PR is the cause.

The test failure output seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b).
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. () 2022-06-02 16:47:05 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule () 2022-06-01 11:27:54 -07:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). () 2022-06-01 09:29:16 +02:00
Sven Mika
94557e3095
[RLlib] Apex-DDPG TrainerConfig objects. () 2022-05-30 19:45:38 +02:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). () 2022-05-28 10:50:03 +02:00
Sven Mika
ab6c3027e5
[RLlib] A2/3C policy sub-classing schema. () 2022-05-28 09:54:47 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. () 2022-05-27 13:14:54 +02:00
Avnish Narayan
eaed256d68
[RLlib] Async parallel execution manager. () 2022-05-25 17:54:08 +02:00
Kai Fricke
67cd984b92
[tune] Add annotations/set scope for Tune classes ()
This PR adds API annotations or changes the scope of several Ray Tune library classes.
2022-05-25 15:21:28 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. () 2022-05-25 14:38:03 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT () 2022-05-24 22:14:25 -07:00
Artur Niederfahrenhorst
d76ef9add5
[RLlib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. () 2022-05-24 14:39:43 +02:00
Sven Mika
e73c37cc17
[RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. () 2022-05-24 12:53:53 +02:00
Sven Mika
4e99a57bab
[RLlib] Add @OverrideToImplementCustomLogic decorators to some Trainer class methods. () 2022-05-24 11:30:50 +02:00
Sven Mika
ec89fe5203
[RLlib] APEX-DQN and R2D2 config objects. () 2022-05-23 12:15:45 +02:00
Sven Mika
baf8c2fa1e
[RLlib] TD3 config objects. () 2022-05-23 10:07:13 +02:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 () 2022-05-23 08:18:44 +02:00
Steven Morad
501d932449
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects () 2022-05-22 19:58:47 +02:00
Sven Mika
44773e810b
[RLlib] DD-PPO Config objects. () 2022-05-22 13:05:24 +02:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ()
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Rohan Potdar
5a70b732e8
[RLlib] MARWIL and BC Config. () 2022-05-21 12:50:20 +02:00
Jun Gong
d5a6d46049
[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. () 2022-05-20 14:10:59 +02:00
Kai Fricke
3e053c85ee
[RLlib] Fix broken links from agent -> algo conversion. () 2022-05-20 11:37:11 +02:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits () 2022-05-19 18:30:42 +02:00
Sven Mika
628ee4b5f0
[RLlib] Bandit tf2 fix (+ add tf2 to test cases). () 2022-05-18 18:58:42 +02:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. () 2022-05-18 09:58:25 +02:00
Jun Gong
dea134a472
[RLlib] Clean up Policy mixins. () 2022-05-17 17:16:08 +02:00
Artur Niederfahrenhorst
c2a1e5abd1
[RLlib] Prioritized Replay (if required) in SimpleQ and DDPG. () 2022-05-17 13:53:07 +02:00
Artur Niederfahrenhorst
fb2915d26a
[RLlib] Replay Buffer API and Ape-X. () 2022-05-17 13:43:49 +02:00
Sven Mika
25001f6d8d
[RLlib] APPO Training iteration fn. () 2022-05-17 10:31:07 +02:00
Sven Mika
0cd7bc4054
[RLlib] Re-establish dashboard performance tests. () 2022-05-16 13:13:49 +02:00
Jun Gong
68a9a33386
[RLlib] Retry agents -> algorithms. with proper doc changes this time. () 2022-05-16 09:45:32 +02:00
Steven Morad
5c96e7223b
[RLlib] SimpleQ (minor cleanups) and DQN TrainerConfig objects. () 2022-05-15 16:14:43 +02:00
Simon Mo
9f23affdc0
[Hotfix] Unbreak lint in master () 2022-05-13 15:05:05 -07:00
Sven Mika
8fe3fd8f7b
[RLlib] QMix TrainerConfig objects. () 2022-05-13 18:50:28 +02:00
kourosh hakhamaneshi
ffcbb30552
[RLlib] Move from agents to algorithms - CQL, MARWIL, AlphaStar, MAML, Dreamer, MBMPO. () 2022-05-13 18:43:36 +02:00
Steven Morad
ebe6ab0afc
[RLlib] Bandits use TrainerConfig objects. () 2022-05-12 22:02:15 +02:00
Max Pumperla
6a6c58b5b4
[RLlib] Config objects for DDPG and SimpleQ. () 2022-05-12 16:12:42 +02:00
Artur Niederfahrenhorst
95d4a83a87
[RLlib] R2D2 Replay Buffer API integration. () 2022-05-10 20:36:14 +02:00
Sven Mika
44a51610c2
[RLlib] SlateQ config objects. () 2022-05-10 20:07:18 +02:00