Failing pytest summaries for flaky tests that eventually succeed are not always cleaned up properly: https://buildkite.com/ray-project/ray-builders-branch/builds/7292#_
This PR ensures we only print summaries when we have at least one summary file (and not just the header file).
It is sometimes hard to find all failing tests in buildkite output logs - even filtering for "FAILED" is cumbersome as the output can be overloaded. This PR adds a small utility to add a short summary log in a separate output section at the end of the buildkite job.
The only shared directory between the Buildkite host machine and the test docker container is `/tmp/artifacts:/artifact-mount`. Thus, we write the summary file to this directory, and delete it before actually uploading it as an artifact in the `post-commands` hook.
We sometimes end up with stale wheel uploads from previous runs of a Buildkite agent. The result is that commit wheels are being overwritten from old build jobs - effectively breaking the wheel build logic.
Example:
This Agent: https://buildkite.com/organizations/ray-project/agents/4b955117-2f6c-4849-b703-3457daf69f89
- builds wheels (in post-wheels tests) for a35ebc945b
- and then runs both the Ray CPP worker and the Train + Tune tests in 6746e9f
- Usually these two tests shouldn't provide artifacts at all, but they do - these are the wheels from a35ebc945b though! Meaning these are uncleaned leftovers from the first build task.
- See here for proof of artifact upload: https://buildkite.com/ray-project/ray-builders-pr/builds/27622#d11bc514-ebd8-4e0c-a2ce-826b9bad27de
The solution is thus to always clean up the artifacts directory in the worker, i.e. `rm -rf /artifact-mount/*`
This PR adds two of such clean up instructions - once before commands are run and once after artifacts are uploaded. We can probably just do either, but it doesn't hurt to have both.