ray/release/ray_release
SangBin Cho 2c2d96eeb1
[Nightly tests] Improve k8s testing (#23108)
This PR fixes broken k8s tests.

Use exponential backoff on the unstable HTTP path. Getting the job status sometimes fails with a broken connection from the server; unfortunately, I couldn't find relevant logs to explain why this happens.
Fix the resource leak check in the benchmark tests. The existing check was broken because job submission uses 0.001 of a node IP resource, which means cluster_resources can never equal the available resources. I fixed the issue by excluding node IP resources from the check.
The K8s infra doesn't support instances with fewer than 8 CPUs, so I used m5.2xlarge instead of xlarge. This increases the cost a bit, but not by much.
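
As a rough illustration of the exponential backoff item above, the retry logic could look something like the sketch below. This is not the code from the PR: the helper name `retry_with_backoff`, its parameters, and the choice of `ConnectionError` as the retryable exception are all assumptions made for the example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    retry_exceptions: Tuple[Type[BaseException], ...] = (ConnectionError,),
    initial_delay_s: float = 1.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), doubling the wait after each retryable failure.

    Hypothetical helper for illustration only. `sleep` is injectable so
    the delays can be observed without actually waiting.
    """
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_exceptions:
            # Give up once the attempt budget is exhausted.
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2  # exponential growth: 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

A status poll would then be wrapped as `retry_with_backoff(lambda: get_status(job_id))`, where `get_status` stands in for whatever call hits the unstable HTTP path.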
2022-03-14 03:49:15 -07:00
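
The resource leak check fix described in the commit message can be sketched as follows. This is an assumption-laden example, not the PR's code: the function names are hypothetical, and it relies on Ray exposing per-node IP resources under keys that start with `node:`. The dictionaries stand in for the outputs of `ray.cluster_resources()` and `ray.available_resources()`.

```python
from typing import Dict


def drop_node_ip_resources(resources: Dict[str, float]) -> Dict[str, float]:
    """Return a copy without per-node IP entries (keys like 'node:10.0.0.1')."""
    return {k: v for k, v in resources.items() if not k.startswith("node:")}


def has_resource_leak(total: Dict[str, float], available: Dict[str, float]) -> bool:
    """Report a leak if any non-node-IP resource is still held.

    Hypothetical helper: node IP resources are ignored because a running
    job submission holds a small fraction (0.001) of one, so they would
    never match.
    """
    return drop_node_ip_resources(total) != drop_node_ip_resources(available)


# Stand-ins for ray.cluster_resources() / ray.available_resources().
total = {"CPU": 8.0, "memory": 100.0, "node:10.0.0.1": 1.0}
available = {"CPU": 8.0, "memory": 100.0, "node:10.0.0.1": 0.999}

print(has_resource_leak(total, available))
```

Comparing the raw dictionaries here would always report a mismatch because of the `node:10.0.0.1` entry, which is the failure mode the commit message describes.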
alerts [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
buildkite [ci/release] Add "tiny" concurrency group, change limits (#23065) 2022-03-11 10:19:38 -08:00
cluster_manager [ci/release] Always use full cluster address (#23067) 2022-03-11 16:31:21 +00:00
command_runner [ci/release] Migrate golden notebook tests (#22949) 2022-03-13 21:39:41 +00:00
file_manager [Nightly test] Support Job based file manager + runner (#22860) 2022-03-10 15:03:50 -08:00
reporter [Release Test] Change release test db reporter report_time to report_timestamp_ms (#22844) 2022-03-07 04:54:19 -08:00
scripts [Release Test] Send release test result to db pipeline for new test infra (#22813) 2022-03-05 07:34:40 +09:00
tests [ci/release] Fix release test config (#23122) 2022-03-13 19:48:34 +00:00
__init__.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
anyscale_util.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
aws.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
config.py [ci/release] Always use full cluster address (#23067) 2022-03-11 16:31:21 +00:00
exception.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
glue.py [Nightly test] Support Job based file manager + runner (#22860) 2022-03-10 15:03:50 -08:00
job_manager.py [Nightly tests] Improve k8s testing (#23108) 2022-03-14 03:49:15 -07:00
logger.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
result.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00
schema.json [ci/release] Validate smoke test fields, enforce frequency (#23075) 2022-03-13 18:48:03 +00:00
util.py [ci/release] Migrate golden notebook tests (#22949) 2022-03-13 21:39:41 +00:00
wheels.py [ci/release] Refactor release test e2e into package (#22351) 2022-02-16 17:35:02 +00:00