hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
mwtian	cf6a54ca46	[CI] pin pytest-asyncio (#21579)	2022-01-13 11:35:30 -08:00
Kai Fricke	a3442df584	[ci/multinode] Build multinode image with OpenSSH before running tests (#21544) Currently we install OpenSSH on the fly in fake multinode docker testing. Instead we can speed testing up a fair bit by building a Docker image which includes OpenSSH first and then run tests with this image.	2022-01-13 08:47:04 -08:00
Kai Fricke	5a7f6e4fdd	[rfc][ci] create fake docker-compose cluster environment (#20256) Following #18987 this PR adds a docker-compose based local multi node cluster. The fake multinode docker comprises two parts. The docker_monitor.py script is a watch script calling docker compose up whenever the docker-compose.yaml changes. The node provider creates and updates the docker compose according to the autoscaling requirements. This mode fully supports autoscaling and comes with test utilities to start and connect to docker-compose autoscaling environments. There's also a sample test case showing how this can be used.	2022-01-11 04:35:36 +00:00
Matti Picus	f3dcd1fac1	WINDOWS: re-enable runtime_env tests, skip cluster tests in serve (#21398) After enabling tests of test_runtime_env_plugin and test_runtime_env_env_vars (PR #21252) and python/ray/serve:* tests (PR #21107), the analysis at flaky-tests.ray.io started showing failing tests in the windows://python/ray/test/serv:test_standalone. PR #21352 reverted #21252 (runtime_env tests), but the problem was more likely in the serve tests. Specifically, `test_standalone` has a test that uses Cluster, which should be skipped on Windows because it is flaky. So this PR - re-enables the runtime_env tests for Windows - skips the Cluster test in serve/tests/test_standalone.py	2022-01-06 21:43:58 -08:00
Archit Kulkarni	fd02065ce5	[CI] [docker] Fix docker image name regex matching (#21409)	2022-01-05 18:59:10 -08:00
Ian Rodney	1b42a49e71	[CI] [Docker Build] Allow Branches with Double digits in regex matching (#21401)	2022-01-05 14:19:19 -08:00
mwtian	24da654d90	[Test] Shard "Small & Large" tests (#21351)	2022-01-05 10:49:14 -08:00
Kai Fricke	94242e3e6e	[ci/repro] Add SYS_PTRACE to docker container, use unique name (#21377) This will start repro docker containers with SYS_PTRACE capabilities to enable debugging e.g. via py-spy. Additionally, default instance name tags for instance re-use will be generated using the buildkite build id and job id.	2022-01-04 16:59:12 +00:00
Archit Kulkarni	4581baa7dc	Revert "WINDOWS: unskip passing runtime_env tests (#21252)" (#21352) This reverts commit `fcb952e1bc`.	2022-01-03 11:07:17 -08:00
Balaji Veeramani	43a9e95dc0	[CI] Add support for Black formatting (#21281)	2022-01-03 10:06:41 -08:00
Kai Fricke	10290eeb2f	[ci] Pin manylinux docker image (#21341)	2022-01-03 14:36:21 +00:00
Kai Fricke	14ed7cfaaa	[ci] Add repro-ci.py script to automatically set up Buildkite-runner-like instances to debug CI runs (#21292) Create an AWS instance to reproduce Buildkite CI builds. This script will take a Buildkite build URL as an argument and create an AWS instance with the same properties running the same Docker container as the original Buildkite runner. The user is then attached to this instance and can reproduce any build commands as if they were executed within the runner. This utility can be used to reproduce and debug build failures that come up on the Buildkite runner instances but not on a local machine.	2021-12-31 10:31:50 +00:00
Matti Picus	3de18d2ada	WINDOWS: enable passing/skipping tests (#21136)	2021-12-27 11:59:00 -08:00
Matti Picus	fcb952e1bc	WINDOWS: unskip passing runtime_env tests (#21252)	2021-12-26 20:49:02 -08:00
Akash Patel	cbcd03b779	Upgrade cython to 0.29.26 for py310 (#21244)	2021-12-26 20:26:08 -08:00
Gagandeep Singh	c5c5fec22b	Unskip `test_standalone` from ci.sh (#21235)	2021-12-25 00:21:58 -08:00
Simon Mo	cfe0897d05	[CI] Migrate Windows tests to Buildkite (#21227)	2021-12-21 20:16:34 -08:00
Amog Kamsetty	57db4640ca	[Train] [Tune] Refactor MLflow (#20802) Pulls out Tune's MLflow logging logic to a shared MLflow util. Adds an MLflow logger callback to Ray Train. Closes #20642	2021-12-21 17:17:52 -08:00
Simon Mo	956774e757	[CI] Disable serve test_standalone on windows again (#21154)	2021-12-17 10:32:27 -08:00
Matti Picus	29965ad325	enable passing serve tests on windows (#21107) * enable passing serve tests on windows * move test_handle to 'medium' and enable it * move test_cli to 'medium'	2021-12-16 14:03:11 -08:00
Matti Picus	d2cd0730a0	[Windows] Enable test_advanced_2 on windows (#20994)	2021-12-15 14:30:40 -08:00
Ian Rodney	c7fb5a94d1	[CI] Upgrade Pip to 21.3 (#21111)	2021-12-15 13:29:45 -08:00
Edward Oakes	10947c83b3	[runtime_env] Make pip installs incremental (#20341) Uses a direct `pip install` instead of creating a conda env to make pip installs incremental to the cluster environment. Separates the handling of `pip` and `conda` dependencies. The new `pip` approach still works if only the base Ray is installed on the cluster and the user specifies libraries like "ray[serve]" in the `pip` field. The mechanism is as follows: - We don't actually want to reinstall ray via pip, since this could lead to version mismatch issues. Instead, we want to use the Ray that's already installed in the cluster. - So if "ray" was included by the user in the pip list, remove it - If a library "ray[serve]" or "ray[tune, rllib]" was included in the pip list, remove it and replace it by its dependencies (e.g. "uvicorn", "requests", ..) Co-authored-by: architkulkarni <arkulkar@gmail.com> Co-authored-by: architkulkarni <architkulkarni@users.noreply.github.com>	2021-12-14 15:55:18 -08:00
Matti Picus	aec04989fc	WINDOWS: enable test_advanced_3.py (#21056)	2021-12-14 09:25:23 -08:00
Eric Liang	6f93ea437e	Remove the flaky test tag (#21006)	2021-12-11 01:03:17 -08:00
Kai Fricke	97ec2a03b6	[ci/buildkite] Add ml pipeline to speed up ML/RLLib tests (#20895) ML tests will be built in a separate bootstrap step installing all required dependencies.	2021-12-09 21:14:10 +00:00
Amog Kamsetty	611bfc1352	[ML] Move `find_free_port` to `ml_utils` (#20828) Small refactoring of common utility used by Train, Tune, and Rllib.	2021-12-03 13:38:42 -08:00
matthewdeng	0de105d42f	[train] update Trainer._is_tune_enabled to work when Tune is not installed (#20767)	2021-11-29 20:08:51 -08:00
Guyang Song	191be85057	[script][format] check copyright for .proto files (#20632) ## Why are these changes needed? - I found that we also have a copyright header in .proto files. Add it to the copyright formatter.	2021-11-23 12:26:30 +08:00
Simon Mo	add2450b92	[CI] [Hotfix] Skip test_standalone (#20556)	2021-11-18 16:47:18 -08:00
shrekris-anyscale	a91ddbdeb9	Add `smart_open` dependency to `ray[default]` (#20420)	2021-11-18 10:00:30 -06:00
Amog Kamsetty	4cbcb11458	[Docker] Add commit as label (#20504) Adds the Ray commit sha as a label for the docker image.	2021-11-17 15:20:41 -08:00
Richard Liaw	1cadd61917	Fix horovod failing tests by pinning down (#20484)	2021-11-17 13:54:25 -08:00
Simon Mo	18d605fa7c	[Serve] Add experimental CLI for `serve deploy` (#20371)	2021-11-16 20:22:09 -08:00
Simon Mo	2dc7a6c9f8	[CI] Pin manylinux image (#20451)	2021-11-16 17:52:51 -08:00
Philipp Moritz	440da92263	Fix manylinux2014 build scripts (#20347)	2021-11-14 19:42:23 -08:00
Edward Oakes	73e570c426	Fix windows build (don't skip test_job_manager.py) (#20294)	2021-11-12 11:13:15 -08:00
Matti Picus	1e80a2a83a	[WINDOWS] unskip tests (#20212)	2021-11-12 10:11:11 -08:00
chenk008	74fa267c72	Enable worker in container CI test (#20174)	2021-11-11 16:11:06 -08:00
Teofilo Zosa	abf0eb53cc	Fix aiohttp 3.8.0 breaking changes (and unpin from 3.7) (#20261)	2021-11-11 15:35:20 -08:00
Alex Wu	d85f7f3bfa	[windows][ci] Skip test_multinode_failures_2.py (typo) (#20206)	2021-11-10 12:05:45 -08:00
architkulkarni	e5e62d8991	[runtime env] Fix runtime env conda test and enable it in CI (#20121)	2021-11-08 18:33:19 -08:00
Alex Wu	45d7ef7c08	[windows][ci] Skip test_multi_node_failure_2 (#20117)	2021-11-07 09:17:46 -08:00
Sven Mika	50c30f89c6	[Tune; RLlib] Move Tune tests that use RLlib into separate buildkite job. (#20016)	2021-11-04 20:40:57 +01:00
Jiao	6cfb52ff1d	[job submission] Add stop API + subprocess cleanup (#19860)	2021-11-04 13:59:47 -05:00
Sven Mika	4cb23d1c95	[Tune; Testing] Revert to 3.7 (undone by accident by previous PR); + some minor comment cleanups. (#20031)	2021-11-04 10:58:34 +01:00
Avnish Narayan	026bf01071	[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) * Fix QMix, SAC, and MADDPG too. * Unpin gym and deprecate pendulum v0. Many tests in RLlib depended on pendulum v0; however, in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so we will potentially have to rerun all of the pendulum v1 benchmarks, or use another environment instead. The same applies to frozen lake v0 and frozen lake v1. Lastly, all of the RLlib tests have been moved to python 3.7. * Add gym installation based on python version. Pin python<= 3.6 to gym 0.19 due to install issues with atari roms in gym 0.20. * Reformatting * Fixing tests * Move atari-py install conditional to req.txt * Migrate to new ale install method * Make parametric_actions_cartpole return float32 actions/obs * Adding type conversions if obs/actions don't match space * Add utils to make elements match gym space dtypes Co-authored-by: Jun Gong <jungong@anyscale.com> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-03 16:24:00 +01:00
Sven Mika	e6ae08f416	[RLlib] Optionally don't drop last ts in v-trace calculations (APPO and IMPALA). (#19601)	2021-11-03 10:01:34 +01:00
Simon Mo	6040319d02	[CI] Pin aiohttp version to fix master branch (#19948)	2021-11-01 23:00:08 -07:00
mwtian	7afdfdc6dd	[CI] narrow down tests that run when files change (#19656)	2021-10-29 16:47:54 -07:00

708 commits