Commit graph

343 commits

Author SHA1 Message Date
Jun Gong
54df8bfe42
[RLlib] Try to checkpoint a durable policy name (#27016) 2022-07-27 00:01:14 -07:00
kourosh hakhamaneshi
5030a4c1d3
[RLlib] Simplify agent collector (#26803) 2022-07-25 13:17:17 -07:00
Avnish Narayan
41c9ef709a
[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844) 2022-07-25 15:11:26 +02:00
Jun Gong
0bc560bd54
[RLlib] Make sure we step() after adding init_obs. (#26827) 2022-07-21 20:43:46 -07:00
Riatre
591cd22be7
Revert "Revert "Bump pytest from 5.4.3 to 7.0.1"" (#26525)
* Revert "Revert "Bump pytest from 5.4.3 to 7.0.1""

This reverts commit ab10890e90.

Signed-off-by: Riatre Foo <foo@riat.re>

* Fix missing test data files dependency in rllib/BUILD

See # 26334 and # 26517 for context.

Once this is in, it should be good to roll-forwrad again.

Signed-off-by: Riatre Foo <foo@riat.re>

* debug: run all tests

Signed-off-by: Riatre Foo <foo@riat.re>

* Revert "debug: run all tests"

This reverts commit 0c5e796b0eb437d64922f66749c61b0412486970.

Signed-off-by: Riatre Foo <foo@riat.re>

* fix new tests since last rebase

Signed-off-by: Riatre Foo <foo@riat.re>
2022-07-18 21:21:19 -07:00
Rohan Potdar
38c9e1d52a
[RLlib]: Fix OPE trainables (#26279)
Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2022-07-17 14:25:53 -07:00
kourosh hakhamaneshi
569fe01096
[RLlib] improved unittests for dataset_reader and fixed bugs (#26458) 2022-07-17 13:38:15 -07:00
Sven Mika
4aea24c8a8
[RLlib] restart_failed_sub_environments now works for MA cases and crashes during reset(); +more tests and logging; add eval worker sub-env fault tolerance test. (#26276) 2022-07-15 08:55:14 +02:00
kourosh hakhamaneshi
be6e4c644f
[RLlib] Feature importance evaluation for offline RL (#26412) 2022-07-11 18:12:50 -07:00
Jun Gong
0c469e490e
[RLlib] Checkpoint and restore connectors. (#26253) 2022-07-09 01:06:24 -07:00
Avnish Narayan
1243ed62bf
[RLlib] Make Dataset reader default reader and enable CRR to use dataset (#26304)
Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>
2022-07-08 12:43:35 -07:00
Jun Gong
52bb8e47d4
[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922) 2022-06-30 08:44:10 +02:00
Avnish Narayan
1f9282a496
[RLlib, Offline] Make the dataset and json readers batchable (#26055)
Make the dataset and json readers batchable.
2022-06-29 11:52:40 -07:00
kourosh hakhamaneshi
f421730b47
[RLlib] Added expectation advantage_type option to CRR. (#26142) 2022-06-28 15:40:09 +02:00
Sven Mika
762cfbdff1
[RLlib] IMPALA and APPO metrics fixes; remove deprecated async_parallel_requests utility. (#26117) 2022-06-28 15:14:37 +02:00
Jun Gong
8c9cac350d
Fix unit test test_check_env.py and est_check_multi_agent.py. (#25993) 2022-06-23 22:55:41 -07:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869) 2022-06-20 15:54:00 +02:00
Artur Niederfahrenhorst
a322cc5765
[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848) 2022-06-17 14:10:36 +02:00
Artur Niederfahrenhorst
f34cd2fd8f
[RLlib] Take replay buffer api example out of GPU examples. (#25841) 2022-06-16 19:12:38 +02:00
Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)
This reverts commit 804719876b.
2022-06-14 13:59:15 -07:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. (#25624) 2022-06-14 10:57:27 +02:00
Kai Fricke
736c7b13c4
[CI] Fix team to rllib (from ml) for some replay buffer API tests. (#25702) 2022-06-11 18:05:16 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. (#25539) 2022-06-11 15:10:39 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. (#24683) 2022-06-10 16:47:51 +02:00
Kai Fricke
aa142eb377
[RLlib; CI] Add team:rllib tag for Bazel. (#25589)
Currently, team:ml spans all ML (Tune, Train, AIR) tests and rllib tests. rllib tests are much more flaky and it would be good to split them up in the flaky test tracker. This PR changes Rllib-tests from team:ml to team:rllib to enable this separation.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-08 22:25:59 +01:00
Sven Mika
388fb98c79
[RLlib] CRR Tests fixes. (#25586) 2022-06-08 19:18:55 +02:00
kourosh hakhamaneshi
4cdd508f70
[RLlib] Added CRR implementation. (#25499) 2022-06-08 11:42:02 +02:00
Jun Gong
9b65d5535d
[RLlib] Introduce basic connectors library. (#25311) 2022-06-07 19:18:14 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056) 2022-06-07 12:52:19 +02:00
Vince Jankovics
68444cd390
[tune] Custom resources per worker added to default_resource_request (#24463)
This resolves the `TODO(ekl): add custom resources here once tune supports them` item. 
Also, related to the discussion [here](https://discuss.ray.io/t/reserve-workers-on-gpu-node-for-trainer-workers-only/5972/5).

Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-06-06 22:41:02 +01:00
Jun Gong
644b80c0ef
[RLlib] mark learning and examples tests exclusive. (#25445) 2022-06-04 09:35:24 -07:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)" (#25420)
This reverts commit e4ceae19ef.

Reverts #25346

linux://python/ray/tests:test_client_library_integration never fail before this PR.

In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128). So high likely it's because of this PR.

And test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b)
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346) 2022-06-02 16:47:05 +02:00
Antoni Baum
045c47f172
[CI] Check test files for if __name__... snippet (#25322)
Bazel operates by simply running the python scripts given to it in `py_test`. If the script doesn't invoke pytest on itself in the `if _name__ == "__main__"` snippet, no tests will be ran, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never ran in CI. This PR adds a lint check to check all `py_test` sources for the presence of `if _name__ == "__main__"` snippet, and will fail CI if there are any detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.
2022-06-02 10:30:00 +01:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). (#25314) 2022-06-01 09:29:16 +02:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). (#24967) 2022-05-28 10:50:03 +02:00
Sven Mika
163fa81976
[RLlib] Discussion 6060 and 5120: auto-infer different agents' spaces in multi-agent env. (#24649) 2022-05-27 14:56:24 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. (#25107) 2022-05-27 13:14:54 +02:00
Avnish Narayan
eaed256d68
[RLlib] Async parallel execution manager. (#24423) 2022-05-25 17:54:08 +02:00
Jun Gong
93ff0beb4e
[RLlib] Introduce utils to serialize gym Spaces (and thus ViewRequirements). (#25007) 2022-05-24 21:12:20 +02:00
Sven Mika
e73c37cc17
[RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. (#24579) 2022-05-24 12:53:53 +02:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 (#24171) 2022-05-23 08:18:44 +02:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896) 2022-05-19 18:30:42 +02:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. (#24507) 2022-05-18 09:58:25 +02:00
Artur Niederfahrenhorst
fb2915d26a
[RLlib] Replay Buffer API and Ape-X. (#24506) 2022-05-17 13:43:49 +02:00
Sven Mika
0cd7bc4054
[RLlib] Re-establish dashboard performance tests. (#24728) 2022-05-16 13:13:49 +02:00
Jun Gong
68a9a33386
[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797) 2022-05-16 09:45:32 +02:00
Steven Morad
6321c3a85c
[RLlib] Simple-Q TrainerConfig (#24583) 2022-05-15 17:24:01 +02:00