Eric Liang
43aa2299e6
[api] Annotate as public / move ray-core APIs to _private and add enforcement rule ( #25695 )
...
Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.
2022-06-21 15:13:29 -07:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). ( #25871 )
2022-06-17 20:12:16 +02:00
Artur Niederfahrenhorst
a322cc5765
[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). ( #25848 )
2022-06-17 14:10:36 +02:00
Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. ( #25624 )" ( #25776 )
...
This reverts commit 804719876b.
2022-06-14 13:59:15 -07:00
Jun Gong
c026374acb
[RLlib] Fix the 2 failing RLlib release tests. ( #25603 )
2022-06-14 14:51:08 +02:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. ( #25624 )
2022-06-14 10:57:27 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. ( #24683 )
2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst
9226643433
[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. ( #25584 )
2022-06-08 19:40:57 +02:00
Jun Gong
9b65d5535d
[RLlib] Introduce basic connectors library. ( #25311 )
2022-06-07 19:18:14 +02:00
Artur Niederfahrenhorst
429d0f0eee
[RLlib] Fix multi agent environment checks for observations that contain only some agents' obs each step. ( #25506 )
2022-06-07 10:33:35 +02:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. ( #25288 )
2022-06-06 13:07:47 +02:00
Artur Niederfahrenhorst
c4a0e9d0f2
[RLlib] Disambiguate timestep fragment storage unit in replay buffers. ( #25242 )
2022-06-06 11:35:49 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
...
This reverts commit e4ceae19ef.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never failed before the reverted PR.
It also fails in the CI of the reverted PR (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128 ), so the failure is very likely caused by that PR.
The test failure output seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b ).
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). ( #25314 )
2022-06-01 09:29:16 +02:00
Sven Mika
c5edd82c63
[RLlib] MB-MPO TrainerConfig objects. ( #25278 )
2022-05-30 17:33:01 +02:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). ( #24967 )
2022-05-28 10:50:03 +02:00
Sven Mika
ab6c3027e5
[RLlib] A2/3C policy sub-classing schema. ( #25078 )
2022-05-28 09:54:47 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. ( #25117 )
2022-05-25 14:38:03 +02:00
Vasilios Mavroudis
edca96353f
[RLlib] Curiosity Bug Fix. ( #24880 )
2022-05-25 09:31:34 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Jun Gong
93ff0beb4e
[RLlib] Introduce utils to serialize gym Spaces (and thus ViewRequirements). ( #25007 )
2022-05-24 21:12:20 +02:00
Artur Niederfahrenhorst
d76ef9add5
[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. ( #24923 )
2022-05-24 14:39:43 +02:00
Sven Mika
ec89fe5203
[RLlib] APEX-DQN and R2D2 config objects. ( #25067 )
2022-05-23 12:15:45 +02:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 ( #24171 )
2022-05-23 08:18:44 +02:00
Artur Niederfahrenhorst
cd16dc4dae
[RLlib] Fix estimated buffer size in replay buffers. ( #24848 )
2022-05-22 21:03:23 +02:00
Steven Morad
501d932449
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects ( #25059 )
2022-05-22 19:58:47 +02:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ( #24999 )
...
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits ( #24896 )
2022-05-19 18:30:42 +02:00
Jun Gong
dea134a472
[RLlib] Clean up Policy mixins. ( #24746 )
2022-05-17 17:16:08 +02:00
Artur Niederfahrenhorst
fb2915d26a
[RLlib] Replay Buffer API and Ape-X. ( #24506 )
2022-05-17 13:43:49 +02:00
Sven Mika
0cd7bc4054
[RLlib] Re-establish dashboard performance tests. ( #24728 )
2022-05-16 13:13:49 +02:00
Max Pumperla
6a6c58b5b4
[RLlib] Config objects for DDPG and SimpleQ. ( #24339 )
2022-05-12 16:12:42 +02:00
Artur Niederfahrenhorst
95d4a83a87
[RLlib] R2D2 Replay Buffer API integration. ( #24473 )
2022-05-10 20:36:14 +02:00
Sven Mika
44a51610c2
[RLlib] SlateQ config objects. ( #24577 )
2022-05-10 20:07:18 +02:00
Artur Niederfahrenhorst
8d906f9bf8
[RLlib] SAC with new Replay Buffer API. ( #24156 )
2022-05-09 14:33:02 +02:00
Artur Niederfahrenhorst
bd2fdf4752
[RLlib] Automate sequences in timeslice_along_seq_lens_with_overlap(). ( #24561 )
2022-05-09 11:55:06 +02:00
Avnish Narayan
f2bb6f6806
[RLlib] Impala training iteration fn ( #23454 )
2022-05-05 16:11:08 +02:00
Artur Niederfahrenhorst
86bc9ecce2
[RLlib] DDPG Training iteration fn & Replay Buffer API ( #24212 )
2022-05-05 09:41:38 +02:00
Sven Mika
b48f63113b
[RLlib] SlateQ fixes: Release learning tests wrong yaml structure + TD-error torch issue ( #24429 )
2022-05-04 13:37:14 +02:00
Kai Fricke
7a4d58d80f
[rllib] Fix doctest failure ( #24343 )
...
Lint was still failing (but only caught with doctest):
```
File "../../python/ray/rllib/utils/numpy.py", line ?, in default
Failed example:
    tree.traverse(make_action_immutable, d, top_down=False)
Exception raised:
    Traceback (most recent call last):
      File "/opt/miniconda/lib/python3.6/doctest.py", line 1330, in __run
        compileflags, 1), test.globs)
      File "<doctest default[4]>", line 1, in <module>
        tree.traverse(make_action_immutable, d, top_down=False)
    NameError: name 'make_action_immutable' is not defined
```
2022-04-29 19:13:24 +01:00
Sven Mika
539832f2c5
[RLlib] SlateQ training iteration function. ( #24151 )
2022-04-29 18:38:17 +02:00
Kai Fricke
242706922b
[rllib] Fix linting ( #24335 )
...
#24262 broke linting. This commit fixes it.
2022-04-29 15:21:11 +01:00
simonsays1980
ff575eeafc
[RLlib] Make actions sent by RLlib to the env immutable. ( #24262 )
2022-04-29 10:27:06 +02:00
Sven Mika
6551922c21
[RLlib] Fix AlphaStar for tf2+tracing; smaller cleanups around avoiding to wrap a TFPolicy as_eager() or with_tracing more than once. ( #24271 )
2022-04-28 13:43:21 +02:00
Sven Mika
627b9f2e88
[RLlib] QMIX training iteration function and new replay buffer API. ( #24164 )
2022-04-27 14:24:20 +02:00