ray/release
Tao Wang a051e693c1
[Test]Add a time check for task benchmark (#23170)
The test_many_tasks.py case was failing frequently, and we found the reason.

We sleep for sleep_time seconds to wait for all tasks to finish, but the actual sleep time is computed as 0.1 * #rounds, where 0.1 is the sleep time per round in seconds.
This looks correct, but it misses one factor: the computation time that elapses in each round. In this case, it is the time consumed by

            cur_cpus = ray.available_resources().get("CPU", 0)
            min_cpus_available = min(min_cpus_available, cur_cpus)
In particular, ray.available_resources() takes quite a while when the cluster is large (in our case, more than 1 s with 1500 nodes).

What we expected to happen:

for _ in range(int(sleep_time / 0.1)):
    sleep(0.1)
What actually happens:

for _ in range(int(sleep_time / 0.1)):
    do_something()  # this takes time, sometimes a lot
    sleep(0.1)
We don't know why ray.available_resources() is slow or whether that is expected, but we can add a time check to make the total sleep time precise.
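
One way such a time check could look is sketched below. This is a minimal illustration, not the code from the actual change: the function name wait_with_deadline, the poll callback, and the injectable clock and sleep parameters are all hypothetical. The idea is to stop the loop based on elapsed wall-clock time instead of a fixed number of rounds, so a slow per-round call cannot stretch the total wait.

```python
import time


def wait_with_deadline(sleep_time, poll, interval=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Call poll() about every `interval` seconds until `sleep_time` has elapsed.

    All names here are hypothetical. `poll` stands in for the per-round work
    (for example, sampling the available CPUs). Returns the number of rounds.
    """
    deadline = clock() + sleep_time
    rounds = 0
    while True:
        poll()  # may be slow; its duration counts against the deadline
        rounds += 1
        remaining = deadline - clock()
        if remaining <= 0:
            break
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
    return rounds
```

With this shape, the total wait overshoots sleep_time by at most the duration of the last poll() call, whereas a fixed round count adds the duration of every poll() call on top of sleep_time.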
2022-04-11 06:27:04 -07:00
..
benchmarks [Test]Add a time check for task benchmark (#23170) 2022-04-11 06:27:04 -07:00
golden_notebook_tests [ci/release] Migrate golden notebook tests (#22949) 2022-03-13 21:39:41 +00:00
horovod_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
jobs_tests Add basic jobs release test with Tune script (#23474) 2022-04-05 13:31:11 -05:00
kubernetes_manual_tests [test][k8s] Restore kubernetes test directory, adds some info (#18982) 2021-10-01 11:23:22 +01:00
lightgbm_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
long_running_distributed_tests [RLlib] Pin Gym Everywhere and turn off gpu for recsim tests (#23452) 2022-03-24 09:17:30 +01:00
long_running_tests [release tests] Pin gym everywhere (#23349) 2022-03-19 02:52:54 -07:00
microbenchmark [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
ml_user_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
nightly_tests Revert "[core][tests] Add nightly test for datasets random_shuffle and sort (#23784)" (#23805) 2022-04-08 13:18:13 -07:00
ray_release [spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib (#23817) 2022-04-10 16:12:53 -07:00
release_logs [Release 1.12.0] Add release logs for 1.12.0rc1 (#23508) 2022-04-07 11:23:04 -07:00
rllib_tests [RLlib; testing] Move num_workers to RLlib config (#23750) 2022-04-06 20:06:48 +02:00
runtime_env_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
serve_tests [serve] [release] Disable smoke test by default (#23334) 2022-03-18 18:40:48 -05:00
sgd_tests/sgd_gpu [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
train_tests/horovod [Train] Fix multi node horovod bug (#22564) 2022-03-22 16:22:53 -07:00
tune_tests [spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib (#23817) 2022-04-10 16:12:53 -07:00
util [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
xgboost_tests [ci/release] Remove old OSS release test infrastructure (#23134) 2022-03-14 15:10:52 +00:00
__init__.py [release] move release testing end to end script to main ray repo (#17070) 2021-07-14 12:39:07 -07:00
BUILD [serve][release tests] Add smoke test to CI for remaining tests (#22962) 2022-03-09 23:36:32 -06:00
README.md [Release] Remove release process doc (#19312) 2021-10-18 11:24:03 -07:00
release_tests.yaml [spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib (#23817) 2022-04-10 16:12:53 -07:00
requirements.txt [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
requirements_buildkite.txt [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
run_release_test.sh [ci/release] Disable infra retries for now (#23132) 2022-03-14 11:51:11 +00:00
setup.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00

Release Tests

While the exact release process relies on Anyscale internal tooling, the tests we run during the releases are located at https://github.com/ray-project/ray/blob/master/release/.buildkite/build_pipeline.py