ray/release/ray_release/alerts/handle.py
Kai Fricke 331b71ea8d
[ci/release] Refactor release test e2e into package (#22351)
Adds a unit-tested and restructured ray_release package for running release tests.

Relevant changes in behavior:

By default, Buildkite waits for the wheels of the current commit to become available. Alternatively, users can specify a) a different commit hash, b) a wheels URL (which we will also wait for to become available), or c) a branch (or user/branch combination), in which case the latest available wheels are used (e.g. if master is passed, the behavior matches the old default behavior).

The main subpackages are:

    Cluster manager: Creates cluster envs/computes, starts cluster, terminates cluster
    Command runner: Runs commands, e.g. as client command or sdk command
    File manager: Uploads/downloads files to/from session
    Reporter: Reports results (e.g. to database)

Much of the code base is unit tested, but there are probably some pieces missing.

Example build (waited for wheels to be built): https://buildkite.com/ray-project/kf-dev/builds/51#_
Wheel build: https://buildkite.com/ray-project/ray-builders-branch/builds/6023
2022-02-16 17:35:02 +00:00


from ray_release.config import Test
from ray_release.exception import ReleaseTestConfigError, ResultsAlert
from ray_release.logger import logger
from ray_release.result import Result

from ray_release.alerts import (
    default,
    long_running_tests,
    rllib_tests,
    tune_tests,
    xgboost_tests,
)

# Maps the alert suite name from a test's config to its result handler.
result_to_handle_map = {
    "default": default.handle_result,
    "long_running_tests": long_running_tests.handle_result,
    "rllib_tests": rllib_tests.handle_result,
    "tune_tests": tune_tests.handle_result,
    "xgboost_tests": xgboost_tests.handle_result,
}


def handle_result(test: Test, result: Result):
    # Tests without an explicit "alert" entry fall back to the default suite.
    alert_suite = test.get("alert", "default")

    logger.info(
        f"Checking results for test {test['name']} using alerting suite "
        f"{alert_suite}"
    )

    if alert_suite not in result_to_handle_map:
        raise ReleaseTestConfigError(f"Alert suite {alert_suite} not found.")

    handler = result_to_handle_map[alert_suite]
    # A truthy return value from the handler is treated as an alert message.
    error = handler(test, result)

    if error:
        raise ResultsAlert(error)

    logger.info("No alerts have been raised - test passed successfully!")
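
The dispatch pattern above can be tried outside the ray_release package with a small self-contained sketch. Everything in it is a stand-in and an assumption, not the package's actual code: the two exception classes only mirror the names imported above, plain dicts replace Test and Result, and default_handler with its "status" check is hypothetical and says nothing about what the real default suite inspects.

```python
from typing import Callable, Dict, Optional


# Stand-ins that only mirror the names in ray_release.exception (assumption).
class ReleaseTestConfigError(Exception):
    pass


class ResultsAlert(Exception):
    pass


def default_handler(test: dict, result: dict) -> Optional[str]:
    # Hypothetical check: alert unless the result reports status "finished".
    if result.get("status") != "finished":
        return f"Test {test['name']} ended with status {result.get('status')}"
    return None


# Suite name -> handler, same shape as result_to_handle_map above.
HANDLERS: Dict[str, Callable[[dict, dict], Optional[str]]] = {
    "default": default_handler,
}


def handle_result(test: dict, result: dict) -> None:
    # Fall back to the "default" suite when the test names no alert suite.
    alert_suite = test.get("alert", "default")

    if alert_suite not in HANDLERS:
        raise ReleaseTestConfigError(f"Alert suite {alert_suite} not found.")

    # A truthy handler return value is treated as the alert message.
    error = HANDLERS[alert_suite](test, result)
    if error:
        raise ResultsAlert(error)
```

In this sketch, calling handle_result({"name": "demo"}, {"status": "finished"}) returns without raising, a result with any other status raises ResultsAlert, and a test whose "alert" key names an unregistered suite raises ReleaseTestConfigError before any handler runs.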