hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Antoni Baum	045c47f172	[CI] Check test files for `if __name__...` snippet (#25322 ) Bazel operates by simply running the Python scripts given to it in `py_test`. If the script doesn't invoke pytest on itself in the `if __name__ == "__main__"` snippet, no tests will be run, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never run in CI. This PR adds a lint check that scans all `py_test` sources for the `if __name__ == "__main__"` snippet and fails CI if any are detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.	2022-06-02 10:30:00 +01:00
Eric Liang	6fe8f7e16b	fix lint (#25393 )	2022-06-01 22:35:30 -07:00
Yi Cheng	cb1f08a3c1	[core] Basic end-2-end multi-node tests for GCS HA in CI. (#25114 ) In this PR we simulate the case where serve can continue to function even when GCS is down and reconfig continues to work once GCS is back. To keep it close to the real-world case, Docker is used for isolation. The test starts a head node (0 CPUs) and a worker node; tries the basic functions and makes sure they work; kills GCS and makes sure everything still works; then starts GCS again and makes sure reconfig continues to work. These are the basic cases for serve HA. We'll add more once we get better integrations.	2022-06-02 02:41:38 +00:00
Simon Mo	b9874f5bd9	[CI][Hotfix] Fix macOS kickoff job (#25377 )	2022-06-01 15:04:17 -07:00
Eric Liang	905258dbc1	Clean up docstyle in python modules and add LINT rule (#25272 )	2022-06-01 11:27:54 -07:00
Yi Cheng	8c70f02652	[build] Fix the `install-bazel.sh` (#25251 ) install-bazel.sh is broken because the path is not set correctly. This PR fixes it.	2022-06-01 10:52:31 -07:00
Philipp Moritz	f61997d90b	Fix typing of gcs_utils.py and add check to CI (#25285 )	2022-05-31 10:45:42 -07:00
Eric Liang	4963dfaae0	[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060 )	2022-05-24 22:14:25 -07:00
Kai Fricke	6a4b361886	[ludwig] Upgrade jsonschema for ludwig tests (#25155 ) Ludwig 0.5.1 requires jsonschema>4, so we have to install it in the test environment. Related: ludwig-ai/ludwig#2055	2022-05-24 17:05:04 +01:00
mwtian	7013b32d15	[Release] prefer last cluster env version in release tests (#24950 ) Currently the release test runner prefers the first successfully built version of a cluster env, instead of the last version. But sometimes a cluster env may build successfully on Anyscale yet fail to launch a cluster (e.g. version 2 here), or new dependencies need to be installed, so a new version needs to be built. The existing logic always picks up the 1st successful build and cannot pick up the new cluster env version. Although this is an edge case (tweaking cluster env versions, with the same Ray wheel or cluster env name), I believe it is possible for others to run into it. Also, avoid running most of the CI tests for changes under release/ray_release/.	2022-05-24 13:26:54 +01:00
Eric Liang	55d039af32	Annotate datasources and add API annotation check script (#24999 ) Why are these changes needed? Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.	2022-05-21 15:05:07 -07:00
Kai Fricke	41b98b1b61	[ci/py310] Fix docker image build/tag (#24922 ) We're currently not building the 3.9/3.10 ray-ml docker images, but we're still trying to tag/push them.	2022-05-18 18:36:37 +01:00
SangBin Cho	fb60d68bbb	[WIP] Run minimal tests against all supported Python versions (#24830 ) Run minimal CI tests against all Python versions.	2022-05-18 09:42:26 -07:00
Chen Shen	1325cf7876	[python3.10] Build py310 images (#24859 ) Build python 3.10 images so we can run release tests.	2022-05-18 08:48:20 -07:00
Antoni Baum	c74886a55e	[CI] Run doc notebooks in CI (#24816 ) Currently, we are not running doc notebooks in CI due to a bazel misconfiguration - we are using `glob` in a top level package in order to get the paths for the notebooks, but those are contained inside subpackages, which glob purposefully ignores. Therefore, the lists of notebooks to run are empty. This PR fixes that by: * Running the `py_test_run_all_notebooks` macro inside the relevant subpackages * Editing the `test_myst_doc.py` script to allow for recursive search for the target file, allowing to deal with mismatches between `name` and `data` arguments in `py_test_run_all_notebooks` * Setting the `allow_empty=False` flag inside `glob` calls in our macros to ensure that this oversight is caught early * Enabling detection of changes in doc folder for `*.ipynb` and `BUILD` files This PR also adds a GPU runner for doc tests, allowing one of our examples to pass - and setting the infra for more to come. Finally, a misconfigured path for one set of doc tests is also fixed.	2022-05-17 09:50:42 +01:00
Edward Oakes	86422a5e4b	[minor] Fix `black` version check in `ci/lint/format.sh` (#24852 ) The `black` version string differs based on installation method. On my (m1) laptop: ``` $ black --version black, version 21.7b0 ``` On @simon-mo's (intel) and @shrekris-anyscale's (m1) laptops: ``` $ black --version black, 21.12b0 (compiled: no) ``` This adds a conditional in `ci/lint/format.sh` to handle both.	2022-05-16 16:40:21 -05:00
Yi Cheng	684e395c5d	Revert "Revert "[core] Move reconnection to RPC layer for GCS client."" (#24764 ) * Revert "Revert "[core] Move reconnection to RPC layer for GCS client. (#24330)" (#24762)" This reverts commit `30f370bf1f`.	2022-05-14 20:35:40 -07:00
Yi Cheng	68384ec745	[ci] Add flag for staging tests and disable the unstable one. (#24745 ) This PR tries to add a prefix for the staging ci test. This is useful to separate staging tests from stable tests in https://flakey-tests.ray.io/	2022-05-13 13:48:14 -07:00
Kai Fricke	b0fa9d6766	[air] Example for Comet ML (#24603 ) After #24459, this PR will add similar support for model artifact saving and an example for experiment tracking with Ray AIR for Comet ML.	2022-05-12 12:12:30 +01:00
Simon Mo	791ce22feb	[CI] Add conditional build to macOS pipeline (#24671 )	2022-05-10 16:49:03 -07:00
Kai Yang	4a999777fa	[Core] Allow accepting gRPC HTTP proxy via env variable (#23526 )	2022-05-10 11:30:46 +08:00
Kai Fricke	5d9bf4234a	[air] Example to track runs with Weights & Biases (#24459 ) This PR - adds an example on how to run Ray Train and log results to weights & biases - adds functionality to the W&B plugin to store checkpoints - fixes a bug introduced in #24017 - Adds a CI utility script to setup credentials - Adds a CI utility script to remove test state from external services cc @simon-mo	2022-05-06 15:52:37 +01:00
Amog Kamsetty	60ded3ef79	[Docker] Start building `ray-ml` CPU Docker image again (#24266 )	2022-04-28 15:29:23 -07:00
xwjiang2010	d9d9fbb044	[ci] try fixing `ensure pip` by down pinning cryptography. (#24238 ) cryptography had a major release (7 hours ago): https://pypi.org/project/cryptography/#history. Suspecting that it's breaking our docker build step in ci.	2022-04-26 17:48:29 -07:00
Kai Fricke	fc1cd89020	[ci] Add short failing test summary for pytests (#24104 ) It is sometimes hard to find all failing tests in buildkite output logs - even filtering for "FAILED" is cumbersome as the output can be overloaded. This PR adds a small utility to add a short summary log in a separate output section at the end of the buildkite job. The only shared directory between the Buildkite host machine and the test docker container is `/tmp/artifacts:/artifact-mount`. Thus, we write the summary file to this directory, and delete it before actually uploading it as an artifact in the `post-commands` hook.	2022-04-26 22:18:07 +01:00
Amog Kamsetty	ae9c68e75f	[Train] Fully deprecate Ray SGD v1 (#24038 ) Ray SGD v1 has been marked as a deprecated API for a while. This PR fully deprecates Ray SGD v1. An error will be raised on any attempt to import the ray.util.sgd package. Closes #16435	2022-04-25 16:12:57 -07:00
Kai Fricke	b86d420a3c	[ci] Only upload wheels to S3 once (#24072 ) Currently all jobs that build wheels put them into the artifacts directory and upload them. This leads to the wheels being overwritten on S3 multiple times. This is not a huge problem as ingress is free, but in order to have a single point of reference, it might be beneficial to limit the wheels uploading to a single Buildkite job. Recently, this has led to interference with stale artifact directories. The downside here is that if the "Wheels & Jars" build fails randomly, the wheels will not be available on S3 - previously they've been also uploaded by several other jobs.	2022-04-25 21:19:11 +01:00
jon-chuang	e6a458a31e	[CI] Create zip of ray `session_latest/logs` dir on test failure and upload to buildkite via `/artifact-mount` (#23783 ) Creates a zip of session_latest dir with test name and timestamp upon python test failure. Writes to dir specified by env var `RAY_TEST_FAILURE_LOGS_DIR`. Noop if env var does not exist. Downstream consumer (e.g. CI) can upload all created artifacts in this dir. Thereby, PR submitters can more easily debug their CI failures, especially if they can't repro locally. Limitations: - a conftest.py file importing the main ray conftest.py needs to be present in same dir as test. This presents a challenge for e.g. dashboard tests which are highly scattered	2022-04-22 09:48:53 +01:00
Guyang Song	0e6c042e29	[Bugfix] fix invalid excluding of Black (#24042 ) - We should use `--force-exclude` when we pass code paths explicitly https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html?highlight=--force-exclude#command-line-options - Recover the files in `python/ray/_private/thirdparty` which were formatted by mistake in the PR https://github.com/ray-project/ray/pull/21975.	2022-04-21 10:21:35 +08:00
Amog Kamsetty	7a3ccb93ee	[CI] Separate out banned words check from formatting script (#23998 ) The recursive grep in the banned words check can get really messy when running locally depending on each person's directory structure or where the format script is being called from. Separates the banned words check as a separate script so that it's not called by default in ./format.sh. Also adds this to the documentation	2022-04-19 13:30:37 -07:00
Jiajun Yao	6e2f9dfe53	[CI] Upload mac wheels to buildkite artifacts (#23930 ) Upload mac wheels to buildkite artifacts and s3.	2022-04-17 13:10:14 -07:00
Kai Fricke	65d9a410f7	[ci] Clean up ci/ directory (refactor ci/travis) (#23866 ) Clean up the ci/ directory. This means getting rid of the travis/ path completely and moving the files into sensible subdirectories. Details: - Moves everything under ci/travis into subdirectories, e.g. ci/build, ci/lint, etc. - Minor adjustments to some scripts (variable renames) - Removes the outdated (unused) asan tests	2022-04-13 18:11:30 +01:00
Amog Kamsetty	38696b155a	[Docker/CI] Add comment on keeping Docker images in sync (#23782 )	2022-04-11 18:09:10 +01:00
Eric Liang	1ff874e8e8	[spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib (#23817 )	2022-04-10 16:12:53 -07:00
Kai Fricke	d27e73f851	[ci] Pin prometheus_client to fix current test outages (#23749 ) What: Pins prometheus_client to < 0.14.0, hopefully fixing today's CI outages Why: New version of the python client (https://github.com/prometheus/client_python/releases) breaks our CI	2022-04-06 14:22:22 -07:00
Kai Fricke	7cf89dd686	[ci] Non-verbose llvm download in Buildkite (#23731 ) What: Use wget -nv in Buildkite environments Why: The llvm download currently clutters the log output as it's not rendered correctly, thus we should silence it. Result: Logs are finally readable again in Buildkite without download: https://buildkite.com/ray-project/ray-builders-pr/builds/28916#25e8965a-d18b-49a1-8e29-200365b13c53	2022-04-05 21:41:51 -07:00
Jiao	ff6515b5a3	Remove `requests` from blacklist of minimal install test (#20584 ) While working on https://github.com/ray-project/ray/pull/20577 we noticed the `requests` module is not blacklisted in the minimal install test, but we are not sure why. As a result we missed coverage on P0 issues like https://github.com/ray-project/ray/issues/20574. This is an attempt to see what would happen if we blacklist it and if we're able to get any signals from CI. Co-authored-by: Jiao Dong <jiaodong@anyscale.com> Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-04-04 16:15:58 +01:00
Yi Cheng	31483a003a	[syncer] skip ray_syncer_test on windows temporarily (#23610 ) ray_syncer_test is flaky on Windows, and it is not easy to investigate what is happening there. The test times out somehow, so we disable it for a short time.	2022-03-30 17:29:08 -07:00
Eric Liang	990b0ec934	Move linkcheck into a separate CI build Why are these changes needed? Linkcheck is inherently flaky, so separate it from the normal LINT build which is never flaky. This also separates the verbose linkcheck logs, making it easier to read the LINT output.	2022-03-29 01:08:53 -07:00
Matti Picus	77c4c1e48e	WINDOWS: enable and fix failures in test_runtime_env_complicated (#22449 )	2022-03-29 00:56:42 -07:00
ddelange	e109c13b83	[ci] Clean up ray-ml requirements (#23325 ) In https://github.com/ray-project/ray/blob/ray-1.11.0/docker/ray-ml/Dockerfile, the order of pip install commands currently matters (potentially a lot). It would be good to run one big pip install command to avoid ending up with a broken env. Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2022-03-25 15:59:54 +00:00
Kai Fricke	940c028540	[ci] Clean up artifacts before/after jobs (#23463 ) We sometimes end up with stale wheel uploads from previous runs of a Buildkite agent. The result is that commit wheels are being overwritten from old build jobs - effectively breaking the wheel build logic. Example: This Agent: https://buildkite.com/organizations/ray-project/agents/4b955117-2f6c-4849-b703-3457daf69f89 - builds wheels (in post-wheels tests) for a35ebc945b - and then runs both the Ray CPP worker and the Train + Tune tests in 6746e9f - Usually these two tests shouldn't provide artifacts at all, but they do - these are the wheels from a35ebc945b though! Meaning these are uncleaned leftovers from the first build task. - See here for proof of artifact upload: https://buildkite.com/ray-project/ray-builders-pr/builds/27622#d11bc514-ebd8-4e0c-a2ce-826b9bad27de The solution is thus to always clean up the artifacts directory in the worker, i.e. `rm -rf /artifact-mount/*` This PR adds two of such clean up instructions - once before commands are run and once after artifacts are uploaded. We can probably just do either, but it doesn't hurt to have both.	2022-03-25 13:07:20 +00:00
Max Pumperla	60054995e6	[docs] fix doctests and activate CI (#23418 )	2022-03-24 17:04:02 -07:00
Dmitri Gekhtman	9ce221f514	Disable KubeRay tests on windows. (#23453 ) This PR disables KubeRay tests on windows, because they're not relevant there.	2022-03-24 08:11:17 -07:00
shrekris-anyscale	b00977b1b1	[serve] Remove dashboard's dependency on Serve (#23389 )	2022-03-21 22:14:41 -07:00
Avnish Narayan	e008a48ef2	[release tests] Pin gym everywhere (#23349 )	2022-03-19 02:52:54 -07:00
Philipp Moritz	886cc4d674	Fix broken links in documentation and put linkcheck linter in place on CI (#23340 )	2022-03-18 21:02:52 -07:00
shrekris-anyscale	56ddea85a1	[Serve] Fix typo `language` (#23213 )	2022-03-16 10:14:44 -07:00
mwtian	6eb805b357	[CI] remove GCS-Ray CI tests (#23149 ) * remove redis ci tests * remove mac	2022-03-14 18:18:59 -07:00
Kai Yang	e9755d87a6	[Lint] One parameter/argument per line for C++ code (#22725 ) It's really annoying to deal with parameter/argument conflicts. It is even more frustrating when we merge code from the community into Ant's internal code base and hit hundreds of conflicts caused by parameters/arguments. In this PR, I updated the clang-format style to make parameters/arguments stay on different lines if they can't fit into a single line. There are several benefits: * Conflict resolution is easier. * Fewer potential human mistakes when resolving conflicts. * Git history and Git blame are more straightforward. * Better readability. * Alignment with the new Python format style.	2022-03-13 17:05:44 +08:00

785 commits
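
The two `black --version` output formats quoted in #24852 above can be handled with a single pattern. The actual fix in that commit is a conditional in the shell script `ci/lint/format.sh`; the Python sketch below only illustrates the parsing idea, and `parse_black_version` is a hypothetical name.

```python
import re
from typing import Optional

# Matches both "black, version 21.7b0" and "black, 21.12b0 (compiled: no)".
_BLACK_VERSION_RE = re.compile(r"^black, (?:version )?(\S+)")


def parse_black_version(output: str) -> Optional[str]:
    """Extract the version token from `black --version` output, or None if unrecognized."""
    match = _BLACK_VERSION_RE.match(output.strip())
    return match.group(1) if match else None
```

Returning `None` for unrecognized output lets the caller decide whether to warn or fail, rather than guessing at a version.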