hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Kai Fricke	6c5229295e	[ci/release] Support running tests with different python versions (#24843 ) OSS release tests currently run with hardcoded Python 3.7 base. In the future we will want to run tests on different python versions. This PR adds support for a new `python` field in the test configuration. The python field will determine both the base image used in the Buildkite runner docker container (for Ray client compatibility) and the base image for the Anyscale cluster environments. Note that in Buildkite, we will still only wait for the python 3.7 base image before kicking off tests. That is acceptable, as we can assume that most wheels finish in a similar time, so even if we wait for the 3.7 image and kick off a 3.8 test, that runner will wait maybe for 5-10 more minutes.	2022-05-17 17:03:12 +01:00
Jiajun Yao	6f14b6a9c3	[Release Test] Add smoke_test field to release test report (#24749 ) Distinguish smoke test and normal test.	2022-05-16 10:38:54 +01:00
Kai Fricke	e1eec5507a	[ci/release] Fix ray version from init test (#24510 ) This release package unit test fails on release branches. Instead of checking for a hard-coded version number, we should just require the value to be non-empty. See e.g. https://buildkite.com/ray-project/ray-builders-pr/builds/31295#b6c6c952-ce34-4521-9342-429e92560dd3	2022-05-05 16:05:23 +01:00
Kai Fricke	8a578c191f	[ci/release] Re-install anyscale package after local env setup (#24373 ) The local environment setup of release tests (in client tests) can sometimes update dependencies of the `anyscale` package to an unsupported version. By re-installing the `anyscale` package after local env setup, we make sure that we can connect to the cluster. Note that this may lead to incompatibilities of the test script, however.	2022-05-01 16:51:55 +01:00
mwtian	02fda97c86	[CI] Re-balance concurrency groups to allow more quota for `large` tests (#24344 ) Currently nightly tests are unable to finish in a day because of concurrency group limit on `large` tests. This is an attempt to adjust the limits so buildkite can run / finish more tests. I will observe which tests fall into the `enormous` group and adjust the test resource / concurrency group limits again.	2022-04-29 22:26:16 +01:00
Kai Fricke	f3857b7aa1	[ci/release] Fix concurrency group calculation for smoke tests (#24269 ) Currently concurrency groups are always calculated based on the full test cluster compute. Instead, smoke tests should use the smoke test cluster compute.	2022-04-27 22:13:25 +01:00
Kai Fricke	6e37a48632	[ci/release] Allow for preferring smoke tests when filtering (#23887) What: Adds a setting "prefer_smoke_tests" to the Buildkite settings. With this, users can choose to kick off smoke tests, if available. Why: The filtering interface of the release testing dialog is a bit complicated at the moment. To kick off smoke tests, users have to know the frequency with which they are configured to run. Instead, users should usually just filter the tests they want to run (using frequency ANY) and optionally specify to run smoke tests, if available.	2022-04-14 06:12:27 +01:00
Kai Fricke	73d1610e69	[ci/release] Fix pipeline build for empty PR repo (#23775) What: If BUILDKITE_PULL_REQUEST_REPO is an empty string, default to DEFAULT_REPO. Why: BUILDKITE_PULL_REQUEST_REPO is set to an empty string by default, thus we're currently not detecting the Buildkite repo correctly in branched builds.	2022-04-07 09:29:48 -07:00
Kai Fricke	7b86a05efd	[ci/release] Parse PR github repos correctly (#23757) What: Correctly infer the GitHub repo from PRs in Buildkite environments. Why: For PRs, we need to check out the correct GitHub repo and branch so we can kick off release tests directly from PRs. Test run (from this PR!): https://buildkite.com/ray-project/release-tests-pr/builds/20#7f5a6526-0040-4896-b23a-f4896c75973d	2022-04-06 17:34:20 -07:00
Jiajun Yao	2959294f02	[CI] Filter release tests by attr regex (#23485 ) Support filtering tests by test attr regex filters. Multiple filters can be specified with one line for each filter. The format is attr:regex (e.g. team:serve)	2022-03-30 09:41:18 -07:00
Kai Fricke	02644ab4d8	[ci/release] Retry cluster env build on failure (#23378 ) Failed cluster env builds should be retried.	2022-03-22 09:45:22 +00:00
Kai Fricke	ca5354ffb1	[ci/release] Fix test_wheels (#23329 )	2022-03-18 14:39:36 +00:00
Kai Fricke	3cf8116df2	[ci/release] Re-enable commit sanity check (#23327 ) Commit sanity checks are currently seemingly disabled. This PR re-enables them by parsing wheel URLs.	2022-03-18 12:57:41 +00:00
Kai Fricke	da140a80e9	[ci/release] Legacy field should be optional (#23326) #22749 broke release unit tests by not providing a legacy key - that key should be optional because we will be dealing with non-legacy tests soon. Additionally, for some reason the unit tests pass on Buildkite while they fail locally and in the release test pipeline. I'm investigating this now...	2022-03-18 11:34:05 +00:00
Kai Fricke	15aeb33e50	[ci/release] Support PR wheels (#23084 ) This PR adds support to find wheels for PRs to run OSS release tests on, i.e. --ray-wheels user:branch to work.	2022-03-14 17:24:13 +00:00
Kai Fricke	956ad95d67	[ci/release] Fix release test config (#23122 ) Currently the test is failing due to an invalid config (merged before validation was properly enforced).	2022-03-13 19:48:34 +00:00
Kai Fricke	c7303f538c	[ci/release] Validate smoke test fields, enforce frequency (#23075 ) Of all smoke test arguments, frequency is the only required one, so we should check for it. Additionally, not all fields should be able to be overwritten (e.g. legacy or name), so we enforce this as well.	2022-03-13 18:48:03 +00:00
Kai Fricke	04ea180dfb	[ci/release] Add "tiny" concurrency group, change limits (#23065 ) E.g. long running tests run on small clusters (often 8 CPUs) but block other jobs for a long time. We should thus add more granularity to the concurrency groups. Additionally, limits have been slightly adjusted to make more sense (e.g. 8 GPUs are now small-gpu, 9+ GPUs large-gpu, instead of 7 for small-gpu and 8 for large-gpu).	2022-03-11 10:19:38 -08:00
Kai Fricke	cac9d30909	[ci/release] Add schema validation for release test config (#22919 ) To avoid breakage like in #22905, this PR adds schema validation to the release test package. In a follow-up PR, we'll likely switch this to use pydantic instead.	2022-03-09 09:50:51 +00:00
Kai Fricke	7425fa6212	[ci/release] Add support for concurrency groups (#22728 ) This PR adds concurrency groups to Buildkite release test runs with new release test package. Five concurrency groups are defined (large-gpu, small-gpu, large, medium, small). If not specified manually, concurrency groups are inferred from used cluster resources. Example pipeline: https://buildkite.com/ray-project/release-tests-branch/builds/55#09109eac-d22e-43bc-889e-078cfb037373 (click on Artifacts --> pipeline.json)	2022-03-02 16:35:54 +01:00
Kai Fricke	3695408a85	[release] Fix special cases in release test package (e.g. smoke test) (#22442 ) Fixing special cases (e.g. smoke tests, long running tests) in the release test package infrastructure. Prepare migration of Tune and XGBoost tests.	2022-02-28 21:05:01 +01:00
Kai Fricke	331b71ea8d	[ci/release] Refactor release test e2e into package (#22351) Adds a unit-tested and restructured ray_release package for running release tests. Relevant changes in behavior: By default, Buildkite will wait for the wheels of the current commit to be available. Alternatively, users can a) specify a different commit hash, b) a wheels URL (which we will also wait for to be available) or c) specify a branch (or user/branch combination), in which case the latest available wheels will be used (e.g. if master is passed, behavior matches old default behavior). The main subpackages are: Cluster manager (creates cluster envs/computes, starts cluster, terminates cluster); Command runner (runs commands, e.g. as client command or SDK command); File manager (uploads/downloads files to/from session); Reporter (reports results, e.g. to database). Much of the code base is unit tested, but there are probably some pieces missing. Example build (waited for wheels to be built): https://buildkite.com/ray-project/kf-dev/builds/51#_ Wheel build: https://buildkite.com/ray-project/ray-builders-branch/builds/6023	2022-02-16 17:35:02 +00:00
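
The attribute filter described in 2959294f02 (one `attr:regex` filter per line, e.g. `team:serve`) could look roughly like the sketch below. This is not the code from that commit: the function names, the sample test entries, and the choice to require every filter to match the full attribute value are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def parse_filters(text: str) -> List[Tuple[str, str]]:
    """Parse one `attr:regex` filter per non-empty line (hypothetical helper)."""
    filters = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so the regex itself may contain colons.
        attr, sep, pattern = line.partition(":")
        if not sep:
            raise ValueError(f"Expected attr:regex, got {line!r}")
        filters.append((attr.strip(), pattern.strip()))
    return filters


def matches(test: Dict[str, str], filters: List[Tuple[str, str]]) -> bool:
    # Assumption: all filters must match, and each regex must match the
    # whole attribute value. A missing attribute is treated as "".
    return all(
        re.fullmatch(pattern, str(test.get(attr, ""))) is not None
        for attr, pattern in filters
    )


# Hypothetical test definitions, for illustration only.
tests = [
    {"name": "serve_micro_benchmark", "team": "serve"},
    {"name": "tune_scalability", "team": "ml"},
]
selected = [t["name"] for t in tests if matches(t, parse_filters("team:serve\n"))]
print(selected)
```

Splitting on the first colon keeps patterns such as `name:foo:.*` usable. Whether the real implementation uses a full match or a substring search is not stated in the commit message.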

22 commits
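
The fix in 73d1610e69 depends on treating an empty `BUILDKITE_PULL_REQUEST_REPO` the same as an unset one. A minimal sketch of that fallback follows, assuming a hypothetical `get_pr_repo` helper and a placeholder `DEFAULT_REPO` value (the log gives neither the real default nor the real function):

```python
import os
from typing import Mapping, Optional

# Placeholder value for illustration; the actual default is not given in the log.
DEFAULT_REPO = "https://github.com/ray-project/ray.git"


def get_pr_repo(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the PR repo, falling back to DEFAULT_REPO (hypothetical helper)."""
    env = os.environ if env is None else env
    # `or` covers both a missing variable (None) and an empty string,
    # whereas env.get(key, DEFAULT_REPO) would return "" for an empty value.
    return env.get("BUILDKITE_PULL_REQUEST_REPO") or DEFAULT_REPO
```

Passing a plain dict as `env` makes the fallback easy to exercise without touching the real process environment.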