It will be easier to develop if we could use a tool to organize / sort imports and not have to move them around by hand.
This PR shows how we could do this with isort (black doesn't quite do this per https://github.com/psf/black/issues/333)
After this PR lands everyone will need to update their formatter to include isort if they don't have it already, i.e.
pip install -r ./python/requirements_linters.txt
All future file changes will go through isort and may introduce a slightly larger PR the first time as it will clean up the imports.
The plan is to land this PR and also clean up the rest of the code in parallel by using this PR to format the codebase (so people won't get surprised by the formatter if the file hasn't been touched yet)
Co-authored-by: Clarence Ng <clarence@anyscale.com>
Currently, team:ml spans all ML (Tune, Train, AIR) tests and rllib tests. rllib tests are much more flaky and it would be good to split them up in the flaky test tracker. This PR changes Rllib-tests from team:ml to team:rllib to enable this separation.
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
The package "ml" should be renamed to "air".
Main question: Keep a `ml.py` with `from ray.air import *` for some level of backwards compatibility?
I'd go for no to force people to use the new structure.
Bazel operates by simply running the python scripts given to it in `py_test`. If the script doesn't invoke pytest on itself in the `if _name__ == "__main__"` snippet, no tests will be ran, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never ran in CI. This PR adds a lint check to check all `py_test` sources for the presence of `if _name__ == "__main__"` snippet, and will fail CI if there are any detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.
In this PR we simulate the case where serve can continue to function even when GCS is down and the reconfig continue to work once GCS is back.
To make it close to the real-world case, the docker is used for isolation:
It starts a head node (0 cpus) and a worker node
It tried the basic function and make sure it's working
It kills GCS and make sure everything is working.
It starts GCS and make sure reconfig continues to work.
This is the basic cases for serve HA. We'll add more once we get better integrations.
Currently the release test runner prefers the first successfully version of a cluster env, instead of the last version. But sometimes a cluster env may build successfully on Anyscale but cannot launch cluster successfully (e.g. version 2 here) or new dependencies need to be installed, so a new version needs to be built. The existing logic always picks up the 1st successful build and cannot pick up the new cluster env version.
Although this is an edge case (tweaking cluster env versions, with the same Ray wheel or cluster env name), I believe it is possible for others to run into it.
Also, avoid running most of the CI tests for changes under release/ray_release/.
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
Currently, we are not running doc notebooks in CI due to a bazel misconfiguration - we are using `glob` in a top level package in order to get the paths for the notebooks, but those are contained inside subpackages, which glob purposefully ignores. Therefore, the lists of notebooks to run are empty. This PR fixes that by:
* Running the `py_test_run_all_notebooks` macro inside the relevant subpackages
* Editing the `test_myst_doc.py` script to allow for recursive search for the target file, allowing to deal with mismatches between `name` and `data` arguments in `py_test_run_all_notebooks`
* Setting the `allow_empty=False` flag inside `glob` calls in our macros to ensure that this oversight is caught early
* Enabling detection of changes in doc folder for `*.ipynb` and `BUILD` files
This PR also adds a GPU runner for doc tests, allowing one of our examples to pass - and setting the infra for more to come. Finally, a misconfigured path for one set of doc tests is also fixed.
The `black` version string differs based on installation method. On my (m1) laptop:
```
$ black --version
black, version 21.7b0
```
On @simon-mo's (intel) and @shrekris-anyscale's (m1) laptops:
```
$ black --version
black, 21.12b0 (compiled: no)
```
This adds a conditional in `ci/lint/format.sh` to handle both.
This PR
- adds an example on how to run Ray Train and log results to weights & biases
- adds functionality to the W&B plugin to store checkpoints
- fixes a bug introduced in #24017
- Adds a CI utility script to setup credentials
- Adds a CI utility script to remove test state from external services cc @simon-mo
It is sometimes hard to find all failing tests in buildkite output logs - even filtering for "FAILED" is cumbersome as the output can be overloaded. This PR adds a small utility to add a short summary log in a separate output section at the end of the buildkite job.
The only shared directory between the Buildkite host machine and the test docker container is `/tmp/artifacts:/artifact-mount`. Thus, we write the summary file to this directory, and delete it before actually uploading it as an artifact in the `post-commands` hook.
Ray SGD v1 has been denoted as a deprecated API for a while. This PR fully deprecates Ray SGD v1. An error will be raised if ray.util.sgd package is attempted to be imported.
Closes#16435
Currently all jobs that build wheels put them into the artifacts directory and upload them. This leads to the wheels being overwritten on S3 multiple times. This is not a huge problem as ingress is free, but in order to have a single point of reference, it might be beneficial to limit the wheels uploading to a single Buildkite job. Recently, this has led to interference with stale artifact directories.
The downside here is that if the "Wheels & Jars" build fails randomly, the wheels will not be available on S3 - previously they've been also uploaded by several other jobs.
Creates a zip of session_latest dir with test name and timestamp upon python test failure. Writes to dir specified by env var `RAY_TEST_FAILURE_LOGS_DIR`. Noop if env var does not exist.
Downstream consumer (e.g. CI) can upload all created artifacts in this dir. Thereby, PR submitters can more easily debug their CI failures, especially if they can't repro locally.
Limitations:
- a conftest.py file importing the main ray conftest.py needs to be present in same dir as test. This presents a challenge for e.g. dashboard tests which are highly scattered
The recursive grep in the banned words check can get really messy when running locally depending on each person's directory structure or where the format script is being called from.
Separates the banned words check as a separate script so that it's not called by default in ./format.sh. Also adds this to the documentation
Clean up the ci/ directory. This means getting rid of the travis/ path completely and moving the files into sensible subdirectories.
Details:
- Moves everything under ci/travis into subdirectories, e.g. ci/build, ci/lint, etc.
- Minor adjustments to some scripts (variable renames)
- Removes the outdated (unused) asan tests
While working on https://github.com/ray-project/ray/pull/20577 we noticed `requests` module is not blacked listed in minimal install test, but not sure why. As a result we missed coverage on P0 issue like https://github.com/ray-project/ray/issues/20574.
This is an attempt to see what would happen if we blacklist it and if we're able to get any signals from CI.
Co-authored-by: Jiao Dong <jiaodong@anyscale.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
Why are these changes needed?
Linkcheck is inherently flaky, so separate it from the normal LINT build which is never flaky. This also separates the verbose linkcheck logs, making it easier to read the LINT output.