hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Author	SHA1	Message	Date
Jun Gong	54df8bfe42	[RLlib] Try to checkpoint a durable policy name (#27016 )	2022-07-27 00:01:14 -07:00
kourosh hakhamaneshi	5030a4c1d3	[RLlib] Simplify agent collector (#26803 )	2022-07-25 13:17:17 -07:00
Avnish Narayan	41c9ef709a	[RLlib] Using PG when not doing microbatching kills A2C performance. (#26844 )	2022-07-25 15:11:26 +02:00
Jun Gong	0bc560bd54	[RLlib] Make sure we step() after adding init_obs. (#26827 )	2022-07-21 20:43:46 -07:00
Riatre	591cd22be7	Revert "Revert "Bump pytest from 5.4.3 to 7.0.1"" (#26525 ) * Revert "Revert "Bump pytest from 5.4.3 to 7.0.1"" This reverts commit `ab10890e90`. Signed-off-by: Riatre Foo <foo@riat.re> * Fix missing test data files dependency in rllib/BUILD See #26334 and #26517 for context. Once this is in, it should be good to roll-forward again. Signed-off-by: Riatre Foo <foo@riat.re> * debug: run all tests Signed-off-by: Riatre Foo <foo@riat.re> * Revert "debug: run all tests" This reverts commit 0c5e796b0eb437d64922f66749c61b0412486970. Signed-off-by: Riatre Foo <foo@riat.re> * fix new tests since last rebase Signed-off-by: Riatre Foo <foo@riat.re>	2022-07-18 21:21:19 -07:00
Rohan Potdar	38c9e1d52a	[RLlib]: Fix OPE trainables (#26279 ) Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2022-07-17 14:25:53 -07:00
kourosh hakhamaneshi	569fe01096	[RLlib] improved unittests for dataset_reader and fixed bugs (#26458 )	2022-07-17 13:38:15 -07:00
Sven Mika	4aea24c8a8	[RLlib] `restart_failed_sub_environments` now works for MA cases and crashes during `reset()`; +more tests and logging; add eval worker sub-env fault tolerance test. (#26276 )	2022-07-15 08:55:14 +02:00
kourosh hakhamaneshi	be6e4c644f	[RLlib] Feature importance evaluation for offline RL (#26412 )	2022-07-11 18:12:50 -07:00
Jun Gong	0c469e490e	[RLlib] Checkpoint and restore connectors. (#26253 )	2022-07-09 01:06:24 -07:00
Avnish Narayan	1243ed62bf	[RLlib] Make Dataset reader default reader and enable CRR to use dataset (#26304 ) Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>	2022-07-08 12:43:35 -07:00
Jun Gong	52bb8e47d4	[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922 )	2022-06-30 08:44:10 +02:00
Avnish Narayan	1f9282a496	[RLlib, Offline] Make the dataset and json readers batchable (#26055 ) Make the dataset and json readers batchable.	2022-06-29 11:52:40 -07:00
kourosh hakhamaneshi	f421730b47	[RLlib] Added `expectation` advantage_type option to CRR. (#26142 )	2022-06-28 15:40:09 +02:00
Sven Mika	762cfbdff1	[RLlib] IMPALA and APPO metrics fixes; remove deprecated `async_parallel_requests` utility. (#26117 )	2022-06-28 15:14:37 +02:00
Jun Gong	8c9cac350d	Fix unit test test_check_env.py and test_check_multi_agent.py. (#25993 )	2022-06-23 22:55:41 -07:00
Sven Mika	96693055bd	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869 )	2022-06-20 15:54:00 +02:00
Artur Niederfahrenhorst	a322cc5765	[RLlib] IMPALA/APPO multi-agent mix-in-buffer fixes (plus MA learning tests). (#25848 )	2022-06-17 14:10:36 +02:00
Artur Niederfahrenhorst	f34cd2fd8f	[RLlib] Take replay buffer api example out of GPU examples. (#25841 )	2022-06-16 19:12:38 +02:00
Yi Cheng	7b8b0f8e03	Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624 )" (#25776 ) This reverts commit `804719876b`.	2022-06-14 13:59:15 -07:00
Avnish Narayan	804719876b	[RLlib] Remove execution plan code no longer used by RLlib. (#25624 )	2022-06-14 10:57:27 +02:00
Kai Fricke	736c7b13c4	[CI] Fix team to `rllib` (from `ml`) for some replay buffer API tests. (#25702 )	2022-06-11 18:05:16 +02:00
Sven Mika	130b7eeaba	[RLlib] `Trainer` to `Algorithm` renaming. (#25539 )	2022-06-11 15:10:39 +02:00
Sven Mika	7c39aa5fac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076 )	2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst	94d6c212df	[RLlib] Replay Buffer API documentation. (#24683 )	2022-06-10 16:47:51 +02:00
Kai Fricke	aa142eb377	[RLlib; CI] Add `team:rllib` tag for Bazel. (#25589 ) Currently, team:ml spans all ML (Tune, Train, AIR) tests and rllib tests. rllib tests are much more flaky and it would be good to split them up in the flaky test tracker. This PR changes Rllib-tests from team:ml to team:rllib to enable this separation. Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2022-06-08 22:25:59 +01:00
Sven Mika	388fb98c79	[RLlib] CRR Tests fixes. (#25586 )	2022-06-08 19:18:55 +02:00
kourosh hakhamaneshi	4cdd508f70	[RLlib] Added CRR implementation. (#25499 )	2022-06-08 11:42:02 +02:00
Jun Gong	9b65d5535d	[RLlib] Introduce basic connectors library. (#25311 )	2022-06-07 19:18:14 +02:00
Rohan Potdar	a9d8da0100	[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056 )	2022-06-07 12:52:19 +02:00
Vince Jankovics	68444cd390	[tune] Custom resources per worker added to default_resource_request (#24463 ) This resolves the `TODO(ekl): add custom resources here once tune supports them` item. Also, related to the discussion [here](https://discuss.ray.io/t/reserve-workers-on-gpu-node-for-trainer-workers-only/5972/5). Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-06-06 22:41:02 +01:00
Jun Gong	644b80c0ef	[RLlib] mark learning and examples tests exclusive. (#25445 )	2022-06-04 09:35:24 -07:00
Sven Mika	b5bc2b93c3	[RLlib] Move all remaining algos into `algorithms` directory. (#25366 )	2022-06-04 07:35:24 +02:00
Yi Cheng	fd0f967d2e	Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to `algorithms` dir and rename policy and trainer classes. (#25346 )" (#25420 ) This reverts commit `e4ceae19ef`. Reverts #25346 linux://python/ray/tests:test_client_library_integration never failed before this PR. In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128). So it is highly likely that this PR is the cause. The test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b)	2022-06-02 20:38:44 -07:00
Sven Mika	e4ceae19ef	[RLlib] Move (A/DD)?PPO and IMPALA algos to `algorithms` dir and rename policy and trainer classes. (#25346 )	2022-06-02 16:47:05 +02:00
Antoni Baum	045c47f172	[CI] Check test files for `if __name__...` snippet (#25322 ) Bazel operates by simply running the python scripts given to it in `py_test`. If the script doesn't invoke pytest on itself in the `if __name__ == "__main__"` snippet, no tests will be run, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never run in CI. This PR adds a lint check to check all `py_test` sources for the presence of the `if __name__ == "__main__"` snippet, and will fail CI if any are detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.	2022-06-02 10:30:00 +01:00
Sven Mika	18c03f8d93	[RLlib] A2C + A3C move to `algorithms` folder and re-name into A2C/A3C (from ...Trainer). (#25314 )	2022-06-01 09:29:16 +02:00
Sven Mika	d95009a3ac	[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). (#24967 )	2022-05-28 10:50:03 +02:00
Sven Mika	163fa81976	[RLlib] Discussion 6060 and 5120: auto-infer different agents' spaces in multi-agent env. (#24649 )	2022-05-27 14:56:24 +02:00
Rohan Potdar	ab81c8e9ca	[RLlib]: Rename `input_evaluation` to `off_policy_estimation_methods`. (#25107 )	2022-05-27 13:14:54 +02:00
Avnish Narayan	eaed256d68	[RLlib] Async parallel execution manager. (#24423 )	2022-05-25 17:54:08 +02:00
Jun Gong	93ff0beb4e	[RLlib] Introduce utils to serialize gym Spaces (and thus ViewRequirements). (#25007 )	2022-05-24 21:12:20 +02:00
Sven Mika	e73c37cc17	[RLlib] MADDPG: Move into main `algorithms` folder and add proper unit and learning tests. (#24579 )	2022-05-24 12:53:53 +02:00
Sven Mika	09886d7ab8	[RLlib] Upgrade gym 0.23 (#24171 )	2022-05-23 08:18:44 +02:00
kourosh hakhamaneshi	3815e52a61	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896 )	2022-05-19 18:30:42 +02:00
Sven Mika	8f50087908	[RLlib] AlphaZero uses training_iteration API. (#24507 )	2022-05-18 09:58:25 +02:00
Artur Niederfahrenhorst	fb2915d26a	[RLlib] Replay Buffer API and Ape-X. (#24506 )	2022-05-17 13:43:49 +02:00
Sven Mika	0cd7bc4054	[RLlib] Re-establish dashboard performance tests. (#24728 )	2022-05-16 13:13:49 +02:00
Jun Gong	68a9a33386	[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797 )	2022-05-16 09:45:32 +02:00
Steven Morad	6321c3a85c	[RLlib] Simple-Q TrainerConfig (#24583 )	2022-05-15 17:24:01 +02:00

343 commits