Jun Gong
644b80c0ef
[RLlib] mark learning and examples tests exclusive. ( #25445 )
2022-06-04 09:35:24 -07:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms
directory. ( #25366 )
2022-06-04 07:35:24 +02:00
Sven Mika
6c7f781d8e
[RLlib] Unflake some CI-tests. ( #25313 )
2022-06-03 14:51:50 +02:00
Jun Gong
1d24d6af98
[RLlib] Fix MARWIL tf policy. ( #25384 )
2022-06-03 10:50:36 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms
dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
...
This reverts commit e4ceae19ef
.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never fail before this PR.
In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128 ). So high likely it's because of this PR.
And test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b )
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms
dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Steven Morad
f781622f86
[RLlib] Bandits (torch) Policy sub-class. ( #25254 )
...
Co-authored-by: Steven Morad <smorad@anyscale.com>
2022-06-02 15:16:51 +02:00
Antoni Baum
045c47f172
[CI] Check test files for if __name__...
snippet ( #25322 )
...
Bazel operates by simply running the python scripts given to it in `py_test`. If the script doesn't invoke pytest on itself in the `if _name__ == "__main__"` snippet, no tests will be ran, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never ran in CI. This PR adds a lint check to check all `py_test` sources for the presence of `if _name__ == "__main__"` snippet, and will fail CI if there are any detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.
2022-06-02 10:30:00 +01:00
Artur Niederfahrenhorst
71a8a443ce
[RLlib] Fix Policy global timesteps being off by init sample batch size. ( #25349 )
2022-06-02 10:19:21 +02:00
kourosh hakhamaneshi
87c9fdd0f8
RLlib: Fix bug: WorkerSet.stop()
will raise error if self._local_worker
is None (e.g. in evaluation worker sets). ( #25332 )
2022-06-02 09:41:43 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms
folder and re-name into A2C/A3C (from ...Trainer). ( #25314 )
2022-06-01 09:29:16 +02:00
Sven Mika
94557e3095
[RLlib] Apex-DDPG TrainerConfig objects. ( #25279 )
2022-05-30 19:45:38 +02:00
Sven Mika
c5edd82c63
[RLlib] MB-MPO TrainerConfig objects. ( #25278 )
2022-05-30 17:33:01 +02:00
Sven Mika
f75ede1b81
[RLlib] MA-DDPG TrainerConfig objects. ( #25255 )
2022-05-30 15:38:24 +02:00
Sven Mika
30f6fc340b
[RLlib] AlphaZero TrainerConfig objects. ( #25256 )
2022-05-30 15:37:58 +02:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). ( #24967 )
2022-05-28 10:50:03 +02:00
Sven Mika
ab6c3027e5
[RLlib] A2/3C policy sub-classing schema. ( #25078 )
2022-05-28 09:54:47 +02:00
Sven Mika
163fa81976
[RLlib] Discussion 6060 and 5120: auto-infer different agents' spaces in multi-agent env. ( #24649 )
2022-05-27 14:56:24 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation
to off_policy_estimation_methods
. ( #25107 )
2022-05-27 13:14:54 +02:00
kourosh hakhamaneshi
9684ea3af6
[RLlib] Fix TorchPolicyV2 bug. ( #25203 )
2022-05-26 20:49:26 +02:00
Avnish Narayan
eaed256d68
[RLlib] Async parallel execution manager. ( #24423 )
2022-05-25 17:54:08 +02:00
Kai Fricke
67cd984b92
[tune] Add annotations/set scope for Tune classes ( #25077 )
...
This PR adds API annotations or changes the scope of several Ray Tune library classes.
2022-05-25 15:21:28 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. ( #25117 )
2022-05-25 14:38:03 +02:00
Vasilios Mavroudis
edca96353f
[RLlib] Curiosity Bug Fix. ( #24880 )
2022-05-25 09:31:34 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Jun Gong
93ff0beb4e
[RLlib] Introduce utils to serialize gym Spaces (and thus ViewRequirements). ( #25007 )
2022-05-24 21:12:20 +02:00
Artur Niederfahrenhorst
d76ef9add5
[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. ( #24923 )
2022-05-24 14:39:43 +02:00
Sven Mika
e73c37cc17
[RLlib] MADDPG: Move into main algorithms
folder and add proper unit and learning tests. ( #24579 )
2022-05-24 12:53:53 +02:00
Sven Mika
4e99a57bab
[RLlib] Add @OverrideToImplementCustomLogic
decorators to some Trainer
class methods. ( #24684 )
2022-05-24 11:30:50 +02:00
Sven Mika
ec89fe5203
[RLlib] APEX-DQN and R2D2 config objects. ( #25067 )
2022-05-23 12:15:45 +02:00
Sven Mika
dea9b86a16
[RLlib] MAML config objects. ( #25066 )
2022-05-23 10:14:24 +02:00
Sven Mika
baf8c2fa1e
[RLlib] TD3 config objects. ( #25065 )
2022-05-23 10:07:13 +02:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 ( #24171 )
2022-05-23 08:18:44 +02:00
Artur Niederfahrenhorst
cd16dc4dae
[RLlib] Fix estimated buffer size in replay buffers. ( #24848 )
2022-05-22 21:03:23 +02:00
Steven Morad
501d932449
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects ( #25059 )
2022-05-22 19:58:47 +02:00
Sven Mika
44773e810b
[RLlib] DD-PPO Config objects. ( #25028 )
2022-05-22 13:05:24 +02:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ( #24999 )
...
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Rohan Potdar
5a70b732e8
[RLlib] MARWIL and BC Config. ( #24853 )
2022-05-21 12:50:20 +02:00
Jun Gong
d5a6d46049
[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. ( #24914 )
2022-05-20 14:10:59 +02:00
Kai Fricke
3e053c85ee
[RLlib] Fix broken links from agent -> algo conversion. ( #25014 )
2022-05-20 11:37:11 +02:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits ( #24896 )
2022-05-19 18:30:42 +02:00
Sven Mika
628ee4b5f0
[RLlib] Bandit tf2 fix (+ add tf2 to test cases). ( #24908 )
2022-05-18 18:58:42 +02:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. ( #24507 )
2022-05-18 09:58:25 +02:00
Nathan Matare
012a4c8667
[RLlib] Allow passing **kwargs to action distribution. ( #24692 )
2022-05-18 09:22:37 +02:00
Jun Gong
dea134a472
[RLlib] Clean up Policy mixins. ( #24746 )
2022-05-17 17:16:08 +02:00
Artur Niederfahrenhorst
c2a1e5abd1
[RLlib] Prioritized Replay (if required) in SimpleQ and DDPG. ( #24866 )
2022-05-17 13:53:07 +02:00
Artur Niederfahrenhorst
fb2915d26a
[RLlib] Replay Buffer API and Ape-X. ( #24506 )
2022-05-17 13:43:49 +02:00
Sven Mika
25001f6d8d
[RLlib] APPO Training iteration fn. ( #24545 )
2022-05-17 10:31:07 +02:00
Sven Mika
0cd7bc4054
[RLlib] Re-establish dashboard performance tests. ( #24728 )
2022-05-16 13:13:49 +02:00