ray/release
Kai Fricke e8abffb017
[tune/release] Improve Tune cloud release tests for durable storage (#23277)
This PR addresses recent failures in the tune cloud tests.

In particular, this PR changes the following:

- The trial runner now waits for any previous sync to finish before syncing again when force=True is supplied. This ensures that the most recent version of the final experiment checkpoints exists on remote storage, and it likely fixes some flakiness in the tests.
- We switched to new cloud buckets that don't interfere with other tests (and are less likely to be garbage collected).
- We now use dated subdirectories in the cloud buckets so that two tests running in parallel don't interfere with each other. Objects are cleaned up afterwards, and the buckets are configured to remove objects after 30 days.
- Lastly, we fixed an issue in the cloud tests where the RELEASE_TEST_OUTPUT file was unavailable when run in Ray client mode (e.g. on Kubernetes).
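
The first and third changes can be illustrated with a minimal sketch. This is not the code from this PR: `SketchSyncer`, `dated_prefix`, the bucket URI, and the timestamp format are all hypothetical names and choices, assumed here only to show the idea of waiting for an in-flight sync before a forced one and of giving each test run its own dated subdirectory.

```python
import threading
import time
from datetime import datetime, timezone


class SketchSyncer:
    """Hypothetical stand-in for a syncer; not Ray Tune's actual class."""

    def __init__(self, upload_fn):
        self._upload_fn = upload_fn  # callable that performs one upload
        self._thread = None

    def wait(self):
        # Block until the in-flight sync (if any) has finished.
        if self._thread is not None:
            self._thread.join()
            self._thread = None

    def sync_up(self, force=False):
        # Returns True if a new sync was started, False if it was skipped.
        if self._thread is not None and self._thread.is_alive():
            if not force:
                return False  # a sync is already running; skip this one
            self.wait()  # forced: let the previous sync finish first
        self._thread = threading.Thread(target=self._upload_fn)
        self._thread.start()
        return True


def dated_prefix(bucket_uri, test_name, now=None):
    # Assumed layout: <bucket>/<timestamp>/<test name>.
    now = now or datetime.now(timezone.utc)
    return f"{bucket_uri.rstrip('/')}/{now:%Y-%m-%d-%H-%M-%S}/{test_name}"


if __name__ == "__main__":
    uploads = []

    def slow_upload():
        time.sleep(0.05)  # simulate a slow upload to remote storage
        uploads.append(time.time())

    syncer = SketchSyncer(slow_upload)
    syncer.sync_up()            # starts the first sync
    syncer.sync_up()            # skipped: the first sync is still running
    syncer.sync_up(force=True)  # waits for the first sync, then syncs again
    syncer.wait()
    print(len(uploads), dated_prefix("s3://example-bucket", "tune_cloud_test"))
```

With the forced call at the end of an experiment, the last upload always starts after the previous one has completed, so the remote copy reflects the final state rather than whatever an earlier, still-running sync happened to capture.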

Local release test runs succeeded.

https://buildkite.com/ray-project/release-tests-branch/builds/189
https://buildkite.com/ray-project/release-tests-branch/builds/191
2022-03-30 09:28:33 -07:00
benchmarks Add perf metrics for test_many_tasks.py (#23318) 2022-03-22 16:16:42 -07:00
golden_notebook_tests [ci/release] Migrate golden notebook tests (#22949) 2022-03-13 21:39:41 +00:00
horovod_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
kubernetes_manual_tests [test][k8s] Restore kubernetes test directory, adds some info (#18982) 2021-10-01 11:23:22 +01:00
lightgbm_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
long_running_distributed_tests [RLlib] Pin Gym Everywhere and turn off gpu for recsim tests (#23452) 2022-03-24 09:17:30 +01:00
long_running_tests [release tests] Pin gym everywhere (#23349) 2022-03-19 02:52:54 -07:00
microbenchmark [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
ml_user_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
nightly_tests [nighly-test] try out spot instances for chaos test #23507 2022-03-27 20:10:21 -07:00
ray_release [tune/release] Improve Tune cloud release tests for durable storage (#23277) 2022-03-30 09:28:33 -07:00
release_logs [Release 1.11.0] Release logs for 1.11.0rc1 (#22443) 2022-02-16 17:03:49 -08:00
rllib_tests [RLlib] Simple-Q uses training iteration fn (instead of execution_plan); ReplayBuffer API for Simple-Q (#22842) 2022-03-29 14:44:40 +02:00
runtime_env_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
serve_tests [serve] [release] Disable smoke test by default (#23334) 2022-03-18 18:40:48 -05:00
sgd_tests/sgd_gpu [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
train_tests/horovod [Train] Fix multi node horovod bug (#22564) 2022-03-22 16:22:53 -07:00
tune_tests [tune/release] Improve Tune cloud release tests for durable storage (#23277) 2022-03-30 09:28:33 -07:00
util [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
xgboost_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
__init__.py [release] move release testing end to end script to main ray repo (#17070) 2021-07-14 12:39:07 -07:00
BUILD [serve][release tests] Add smoke test to CI for remaining tests (#22962) 2022-03-09 23:36:32 -06:00
README.md [Release] Remove release process doc (#19312) 2021-10-18 11:24:03 -07:00
release_tests.yaml [tune/release] Improve Tune cloud release tests for durable storage (#23277) 2022-03-30 09:28:33 -07:00
requirements.txt [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
requirements_buildkite.txt [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
run_release_test.sh [ci/release] Disable infra retries for now (#23132) 2022-03-14 11:51:11 +00:00
setup.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00

Release Tests

While the exact release process relies on Anyscale internal tooling, the tests we run during releases are located at https://github.com/ray-project/ray/blob/master/release/.buildkite/build_pipeline.py