Commit graph

6 commits

Author SHA1 Message Date
Kai Fricke
724377163f
[ci/release] Unstable tests should only soft fail the build (#23403)
This will leave the tests green if the test is failing but marked as unstable.
2022-03-23 09:38:56 +00:00
Kai Fricke
e510d81c71
[ci/release] Save test config and results as artifacts (#23278)
It is good to have these information readily available when checking test results, as it will reveal both the original configuration (that could change over time) as well as the achieved results.
Also gets rid of the unneeded old alerts directory.

https://buildkite.com/ray-project/release-tests-branch/builds/190#ef531787-412c-40ec-81e6-beb495830c60
2022-03-18 09:26:42 +00:00
Kai Fricke
d93fa95dd5
[ci/release] Only report results for scheduled builds (#23135)
Currently, all buildkite runs report per default. Instead, we only want to report when running scheduled builds or when specifically overriding this behavior.
2022-03-14 15:10:16 +00:00
Kai Fricke
a8bed94ed6
[ci/release] Always use full cluster address (#23067)
Not using the full cluster address is deprecated and breaks Job usage for uploads/downloads: https://buildkite.com/ray-project/release-tests-branch/builds/135#2a03e47b-6a9a-42ff-9346-905725eb8d09
2022-03-11 16:31:21 +00:00
Kai Fricke
7425fa6212
[ci/release] Add support for concurrency groups (#22728)
This PR adds concurrency groups to Buildkite release test runs with new release test package. Five concurrency groups are defined (large-gpu, small-gpu, large, medium, small). If not specified manually, concurrency groups are inferred from used cluster resources.

Example pipeline: https://buildkite.com/ray-project/release-tests-branch/builds/55#09109eac-d22e-43bc-889e-078cfb037373 (click on Artifacts --> pipeline.json)
2022-03-02 16:35:54 +01:00
Kai Fricke
331b71ea8d
[ci/release] Refactor release test e2e into package (#22351)
Adds a unit-tested and restructured ray_release package for running release tests.

Relevant changes in behavior:

Per default, Buildkite will wait for the wheels of the current commit to be available. Alternatively, users can a) specify a different commit hash, b) a wheels URL (which we will also wait for to be available) or c) specify a branch (or user/branch combination), in which case the latest available wheels will be used (e.g. if master is passed, behavior matches old default behavior).

The main subpackages are:

    Cluster manager: Creates cluster envs/computes, starts cluster, terminates cluster
    Command runner: Runs commands, e.g. as client command or sdk command
    File manager: Uploads/downloads files to/from session
    Reporter: Reports results (e.g. to database)

Much of the code base is unit tested, but there are probably some pieces missing.

Example build (waited for wheels to be built): https://buildkite.com/ray-project/kf-dev/builds/51#_
Wheel build: https://buildkite.com/ray-project/ray-builders-branch/builds/6023
2022-02-16 17:35:02 +00:00