
10 commits

Author SHA1 Message Date
Kai Fricke
7091a32fe1
[ci/release] Support running tests on staging (#25889)
This adds "environments" to the release package, which can be used to configure a set of environment variables. These variables are loaded either via an `--env` argument or via an `env` entry in the test definition and can be used, for example, to run release tests on staging.
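A minimal sketch of how such an environment lookup could work. The `ENVIRONMENTS` table, the variable name `EXAMPLE_HOST`, the function names, and the precedence of `--env` over the test's `env` entry are all assumptions for illustration, not taken from the release package.

```python
import os
from typing import Dict, Optional

# Hypothetical environment table. The real variable names and values
# used by the release package are not shown in this commit message.
ENVIRONMENTS: Dict[str, Dict[str, str]] = {
    "prod": {"EXAMPLE_HOST": "https://prod.example.com"},
    "staging": {"EXAMPLE_HOST": "https://staging.example.com"},
}


def resolve_env_name(cli_env: Optional[str], test: dict, default: str = "prod") -> str:
    """Pick an environment name.

    Assumed precedence: the --env argument wins, then the test's `env`
    entry, then a default.
    """
    return cli_env or test.get("env") or default


def load_environment(name: str, target: Optional[dict] = None) -> dict:
    """Copy the variables of the named environment into `target`.

    `target` defaults to os.environ; pass a plain dict to avoid
    touching the real process environment.
    """
    if name not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {name}")
    target = os.environ if target is None else target
    target.update(ENVIRONMENTS[name])
    return target
```

For example, `load_environment(resolve_env_name(None, {"env": "staging"}), {})` would return a dict holding the staging values without modifying `os.environ`.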
2022-06-28 10:14:01 -07:00
Kai Fricke
6c5229295e
[ci/release] Support running tests with different python versions (#24843)
OSS release tests currently run with a hardcoded Python 3.7 base image. In the future, we will want to run tests on different Python versions.
This PR adds support for a new `python` field in the test configuration. The `python` field determines both the base image used in the Buildkite runner Docker container (for Ray client compatibility) and the base image for the Anyscale cluster environments.

Note that in Buildkite, we will still only wait for the Python 3.7 base image before kicking off tests. That is acceptable, as we can assume that most wheels finish building at around the same time, so even if we wait for the 3.7 image and kick off a 3.8 test, that runner will only wait for perhaps 5-10 more minutes.
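As a rough illustration of the `python` field, the sketch below derives both base images from one version string. The image names (`example/runner`, `example/cluster-env`) and the function names are hypothetical. Only the `python` field and the 3.7 default come from the description above.

```python
from typing import Tuple

DEFAULT_PYTHON = "3.7"  # the previously hardcoded version


def python_tag(version: str) -> str:
    """Turn a version such as "3.8" into a tag such as "py38"."""
    major, minor = version.split(".")[:2]
    return f"py{major}{minor}"


def base_images(test: dict) -> Tuple[str, str]:
    """Return (runner image, cluster env base image) for a test.

    Both image names are placeholders, not the real image names.
    """
    version = str(test.get("python", DEFAULT_PYTHON))
    tag = python_tag(version)
    return f"example/runner:{tag}", f"example/cluster-env:{tag}"
```

If the test configuration is written in YAML, the version is best quoted: an unquoted `3.10` is parsed as the float `3.1`.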
2022-05-17 17:03:12 +01:00
SangBin Cho
295b4436b3
[Nightly tests] Increase wait for nodes timeout (#24457)
Although there is enough quota, it is possible that AWS doesn't have enough capacity to start up new nodes. According to @allenyin55, the current wait-for-nodes timeout is too short. This PR increases the timeout from 600 seconds to 3000 seconds (50 minutes). Let's see if this resolves the issue. If it makes things worse, I will revert it quickly (I will closely monitor the infra failure rate).
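The timeout in question can be pictured as a polling loop like the one below. This is a sketch under assumptions: `wait_for_nodes`, `count_nodes`, and the polling interval are hypothetical names and values, and only the 3000-second default reflects the change described here.

```python
import time
from typing import Callable


def wait_for_nodes(
    count_nodes: Callable[[], int],
    expected: int,
    timeout_s: float = 3000.0,  # raised from 600 seconds in this PR
    poll_s: float = 10.0,       # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until `expected` nodes are up; return the seconds waited.

    Raises TimeoutError if the nodes do not appear within `timeout_s`.
    """
    start = clock()
    while True:
        if count_nodes() >= expected:
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(
                f"Fewer than {expected} nodes after {timeout_s} seconds"
            )
        sleep(poll_s)
```

Passing `clock` and `sleep` as parameters keeps the loop testable without actually waiting.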
2022-05-04 19:42:21 -07:00
Kai Fricke
6e37a48632
[ci/release] Allow for preferring smoke tests when filtering (#23887)
What: Adds a setting "prefer_smoke_tests" to the Buildkite settings. With this, users can specify that smoke tests should be kicked off, if available.

Why: The filtering interface of the release testing dialog is a bit complicated at the moment: in order to kick off smoke tests, users have to know the frequency with which they are configured to run. Instead, users should usually just filter the tests they want to run (using frequency ANY) and optionally specify that smoke tests should be run, if available.
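The intended behavior can be sketched as follows. The test dictionary keys (`name`, `frequency`, `smoke_test`) and the function name are assumptions for illustration. Only the `prefer_smoke_tests` setting and the ANY frequency are named in the description.

```python
from typing import List, Tuple


def filter_tests(
    tests: List[dict],
    frequency: str = "any",
    prefer_smoke_tests: bool = False,
) -> List[Tuple[str, bool]]:
    """Return (test name, run_as_smoke_test) pairs for the selected tests.

    A test runs as a smoke test only if smoke tests are preferred and
    the test actually defines a smoke test configuration.
    """
    selected = []
    for test in tests:
        if frequency != "any" and test.get("frequency") != frequency:
            continue
        has_smoke_test = "smoke_test" in test
        selected.append((test["name"], prefer_smoke_tests and has_smoke_test))
    return selected
```

With this shape, a user filters by name or frequency as usual and flips a single flag, without needing to know at which frequency a smoke test is configured.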
2022-04-14 06:12:27 +01:00
Kai Fricke
3cf8116df2
[ci/release] Re-enable commit sanity check (#23327)
Commit sanity checks currently appear to be disabled. This PR re-enables them by parsing wheel URLs.
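One way such a check could work is sketched below. It assumes that the commit hash appears in the wheel URL as a 40-character hexadecimal path segment. That URL layout, the example URL in the comment, and both function names are assumptions, not confirmed by this commit message.

```python
import re
from typing import Optional

# Assumption: the commit is a full 40-character hex path segment, e.g.
# https://example.com/wheels/master/<commit>/ray-...-manylinux2014_x86_64.whl
COMMIT_SEGMENT = re.compile(r"/([0-9a-f]{40})/")


def commit_from_wheels_url(url: str) -> Optional[str]:
    """Extract the commit hash from a wheels URL, or None if absent."""
    match = COMMIT_SEGMENT.search(url)
    return match.group(1) if match else None


def commit_matches(wheels_url: str, installed_commit: str) -> bool:
    """Check that the installed commit is the one the URL points to.

    Returns True when the URL carries no commit, since there is then
    nothing to compare against.
    """
    expected = commit_from_wheels_url(wheels_url)
    return expected is None or expected == installed_commit
```

On a cluster, `installed_commit` could come from `ray.__commit__`.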
2022-03-18 12:57:41 +00:00
Kai Fricke
a8bed94ed6
[ci/release] Always use full cluster address (#23067)
Not using the full cluster address is deprecated and breaks Job usage for uploads/downloads: https://buildkite.com/ray-project/release-tests-branch/builds/135#2a03e47b-6a9a-42ff-9346-905725eb8d09
2022-03-11 16:31:21 +00:00
SangBin Cho
ebac18d163
[Nightly test] Support Job based file manager + runner (#22860)
This PR adds support for the job-based file manager and runner. It will be the backbone of the k8s migration.

The PR handles edge cases that originally existed in the old e2e.py job-based runners.
2022-03-10 15:03:50 -08:00
Kai Fricke
ac654dbb9d
[ci/release] Fix schema validation for single tests / add stable field (#22947)
Builds currently fail with schema validation errors after #22901 was merged (the stable field had incorrectly not been added to the schema before).
2022-03-09 15:22:49 +00:00
Kai Fricke
cac9d30909
[ci/release] Add schema validation for release test config (#22919)
To avoid breakage like in #22905, this PR adds schema validation to the release test package.
In a follow-up PR, we'll likely switch this to use pydantic instead.
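To illustrate the idea, here is a small hand-rolled validator using only the standard library. It is not the validation code from the PR: the `SCHEMA` table, its fields, and `validate_test` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: field name -> (expected type, required).
SCHEMA: Dict[str, Tuple[type, bool]] = {
    "name": (str, True),
    "frequency": (str, True),
    "stable": (bool, False),
}


def validate_test(test: dict) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    for field, (expected_type, required) in SCHEMA.items():
        if field not in test:
            if required:
                errors.append(f"missing required field: {field}")
        elif not isinstance(test[field], expected_type):
            errors.append(
                f"field {field} should be of type {expected_type.__name__}"
            )
    # Reject fields the schema does not know about.
    for field in test:
        if field not in SCHEMA:
            errors.append(f"unknown field: {field}")
    return errors
```

Rejecting unknown fields is a design choice: it catches typos early, but every newly introduced field must be added to the schema before tests using it will validate.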
2022-03-09 09:50:51 +00:00
Kai Fricke
331b71ea8d
[ci/release] Refactor release test e2e into package (#22351)
Adds a unit-tested and restructured ray_release package for running release tests.

Relevant changes in behavior:

By default, Buildkite will wait for the wheels of the current commit to be available. Alternatively, users can specify a) a different commit hash, b) a wheels URL (which we will also wait for to be available) or c) a branch (or user/branch combination), in which case the latest available wheels will be used (e.g. if master is passed, the behavior matches the old default behavior).
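The four cases above could be told apart roughly as follows. This is only a sketch: the function name, the returned labels, and the heuristic of treating any 7 to 40 character hexadecimal string as a commit hash are assumptions.

```python
import re
from typing import Optional


def classify_wheels_spec(spec: Optional[str]) -> str:
    """Classify a user-provided wheels specification.

    Returns one of "current_commit", "url", "commit", or "branch".
    """
    if not spec:
        # Nothing specified: wait for the wheels of the current commit.
        return "current_commit"
    if spec.startswith(("http://", "https://")):
        return "url"
    if re.fullmatch(r"[0-9a-f]{7,40}", spec):
        # Heuristic: a branch with an all-hex name would be misclassified.
        return "commit"
    # Anything else is treated as a branch or user/branch combination.
    return "branch"
```

Under these assumptions, only the "branch" case would use the latest available wheels. The other three would wait for specific wheels to become available.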

The main subpackages are:

    Cluster manager: Creates cluster envs/computes, starts cluster, terminates cluster
    Command runner: Runs commands, e.g. as client command or sdk command
    File manager: Uploads/downloads files to/from session
    Reporter: Reports results (e.g. to database)

Much of the code base is unit tested, but there are probably some pieces missing.

Example build (waited for wheels to be built): https://buildkite.com/ray-project/kf-dev/builds/51#_
Wheel build: https://buildkite.com/ray-project/ray-builders-branch/builds/6023
2022-02-16 17:35:02 +00:00