hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Kai Fricke	d06c3ffd6f	[release] Migrate Tune + XGBoost tests to new infrastructure (#22705 ) Migrate XGBoost and Tune tests to new release testing infrastructure. https://buildkite.com/ray-project/release-tests-branch/builds/50	2022-03-01 08:10:06 +01:00
SangBin Cho	2c1184592e	mark threaded actor test unstable (#22696 )	2022-02-28 15:25:14 -08:00
Clark Zinzow	cf3577f0ee	[Datasets] Patch Parquet file fragment serialization to prevent metadata fetching. (#22665 )	2022-02-28 15:15:30 -08:00
Chen Shen	7e90700521	[Dataset][nightly-test] promote data ingestion test to stable #22702	2022-02-28 14:00:18 -08:00
Kai Fricke	3695408a85	[release] Fix special cases in release test package (e.g. smoke test) (#22442 ) Fixing special cases (e.g. smoke tests, long running tests) in the release test package infrastructure. Prepare migration of Tune and XGBoost tests.	2022-02-28 21:05:01 +01:00
SangBin Cho	1cedb1b6e4	[Test] Increase timeout for microbenchmark (#22655 )	2022-02-25 17:29:12 -08:00
Sven Mika	7b687e6cd8	[RLlib] SlateQ: Add a hard-task learning test to weekly regression suite. (#22544 )	2022-02-25 21:58:16 +01:00
Archit Kulkarni	31332f8930	[serve] [release tests] Add health check grace period for 1k deployment (#22651 )	2022-02-25 12:13:44 -06:00
Archit Kulkarni	1165f99b0b	[CI] disable Serve microbenchmark k8s (#22631 )	2022-02-24 16:50:06 -08:00
Yi Cheng	de76d86bcb	[nightly] Stop GCS HA related nightly test (#22636 ) Since we've already turned it on on master, we should stop these tests for now.	2022-02-24 16:40:08 -08:00
Jun Gong	99b7be5e22	[rllib] Fix impala long running test (#22619 ) fix impala long running test. Bandits is the first agent that requires torch import at registration time.	2022-02-24 09:03:55 -08:00
SangBin Cho	5e847f7e09	[Usage Stats] Usage stats only enabled on nightly test infra (#22591 ) This PR enables the usage stats only on the release test infrastructure (large scale tests Ray runs on a daily basis in a private infra). Note it is still disabled by default in Ray.	2022-02-23 22:11:48 -08:00
Eric Liang	e15a419028	Enable stage fusion by default for dataset pipelines (#22476 ) This PR enables stage fusion for dataset pipelines. This also requires: 1. Removing the num_cpus=0.5 default for the read stage, to enable fusion of the read stage. 2. Removing spread_resource_prefix (not supported for now).	2022-02-23 17:34:05 -08:00
Max Pumperla	29d94a2211	[docs] sphinx gallery removal, migrate to ipynb (#22467 )	2022-02-19 01:19:07 -08:00
Jiajun Yao	baa14d695a	Round robin during spread scheduling (#21303 ) - Separate spread scheduling and default hybrid scheduling (i.e. SpreadScheduling != HybridScheduling(threshold=0)): they are already separated in the API layer and they have different end goals, so it makes sense to separate their implementations and evolve them independently. - Simple round robin for spread scheduling: this is just a starting implementation that can be optimized later. - Prefer not to spill back tasks that are waiting for args since the pull is already in progress.	2022-02-18 15:05:35 -08:00
Stephanie Wang	03a5589591	[core] Enable lineage reconstruction in CI (#21519 ) Enables lineage reconstruction in all CI and release tests.	2022-02-18 11:04:20 -08:00
Chen Shen	17f589a05d	[Dataset][nightly-test] use 2 instead of 15 windows for 1.5TB data ingestion #22479	2022-02-17 15:20:39 -08:00
mwtian	05dd72101b	[Release 1.11.0] Release logs for 1.11.0rc1 (#22443 ) This is the release log for 1.11.0rc1, with GCS-Ray enabled. The diff is against 1.11.0rc0, without GCS-Ray.	2022-02-16 17:03:49 -08:00
Chen Shen	30ec0df9cc	[placement group] fix pg benchmark regression #22441 We added a warmup time in timeit which affects the pg benchmark time accounting. add an option to cancel warmup.	2022-02-16 16:24:51 -08:00
Jun Gong	a9147bb62c	[Release Test] Fix AnyscaleSDK construction so we can run CI on staging instance. (#22325 )	2022-02-16 09:56:02 -08:00
SangBin Cho	42361a1801	[Test] Fix Dask on Ray 1 TB bug #22431 Fixes a bug. It seems like `not df` does not work with a dataframe.	2022-02-17 02:44:36 +09:00
Kai Fricke	331b71ea8d	[ci/release] Refactor release test e2e into package (#22351 ) Adds a unit-tested and restructured ray_release package for running release tests. Relevant changes in behavior: By default, Buildkite will wait for the wheels of the current commit to be available. Alternatively, users can specify a) a different commit hash, b) a wheels URL (which we will also wait for to be available), or c) a branch (or user/branch combination), in which case the latest available wheels will be used (e.g. if master is passed, behavior matches the old default behavior). The main subpackages are: Cluster manager (creates cluster envs/computes, starts cluster, terminates cluster); Command runner (runs commands, e.g. as client command or sdk command); File manager (uploads/downloads files to/from session); Reporter (reports results, e.g. to database). Much of the code base is unit tested, but there are probably some pieces missing. Example build (waited for wheels to be built): https://buildkite.com/ray-project/kf-dev/builds/51#_ Wheel build: https://buildkite.com/ray-project/ray-builders-branch/builds/6023	2022-02-16 17:35:02 +00:00
SangBin Cho	2ed5bb7a5f	[Nightly Test] Addressed client failure properly (#22438 ) When the client returns a non-zero code, we should raise RuntimeError to properly propagate errors.	2022-02-16 09:03:17 -08:00
Jun Gong	04dd536987	[Release tests] Disable A3C CI tests on torch for now. Also extend performance_test deadline to 3hrs. (#22426 )	2022-02-16 13:06:09 +01:00
Kai Fricke	c866131cc0	[tune] Retry cloud sync up/down/delete on fail (#22029 )	2022-02-15 12:27:29 +00:00
SangBin Cho	640d92c385	It seems like the S3 read sometimes fails; #22214. I found out the file actually does exist in S3, so it is highly likely a transient error. This PR adds a retry mechanism to avoid the issue.	2022-02-12 11:58:58 +09:00
Jun Gong	cbd24503b6	[RLlib] Add A3C to RLlib performance regression tests. (#22316 )	2022-02-11 21:18:53 +01:00
Archit Kulkarni	da57012cbc	Add comment to periodic CI pipeline to update release process doc when updating test suites (#22037 ) This PR adds a comment to build_pipeline.py reminding anyone who makes changes to the test suites to also update the release process doc if necessary. This is an action item from the Ray 1.10.0 release retrospective.	2022-02-11 11:14:24 -06:00
Chen Shen	0866a5558f	[Dataset][nightly-test] pin pyarrow==4.0.1 for dataset related tests (#22277 ) * pin pyarrow==4.0.1 * address comments	2022-02-10 14:22:41 -08:00
Sven Mika	04a5c72ea3	Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708 )	2022-02-10 13:44:22 +01:00
mwtian	47a56ca062	[Release] Add release logs for 1.11.0rc0 (GCS KV & pubsub not enabled) (#22041 )	2022-02-10 00:03:31 -08:00
SangBin Cho	30000ff8ae	Fix a bug from many drivers. (#22248 ) After this PR (https://github.com/ray-project/ray/pull/22156), for some reason the driver script contains a string that cannot be encoded with ascii. It seems like using utf-8 solves the problem.	2022-02-09 15:17:15 -08:00
Alex Wu	b122f093c1	Revert "[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test." (#22250 ) Reverts ray-project/ray#22126 Breaks rllib:tests/test_io	2022-02-09 09:26:36 -08:00
Yi Cheng	8b1bbfe8e4	[e2e] Fix an error when "env_vars" is not set. (#22234 ) To fix error in session https://buildkite.com/ray-project/periodic-ci/builds/2699#c532ed2b-ee89-48ad-a7db-fd4211ef8bd9	2022-02-08 22:05:53 -08:00
Yi Cheng	d8ac01bd5c	[e2e] Update e2e test to use redisless ray by default. (#22189 ) As title, after infra got updated, we need to merge the PR so that test can run ray without redis.	2022-02-08 19:46:48 -08:00
Sven Mika	ac3e6ab411	[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test. (#22126 )	2022-02-08 19:04:13 +01:00
SangBin Cho	ac00389cbe	[Nightly test] Bring back the old way of running commands. (#22209 ) Bring back the old way of running commands for non-k8s tests. This also fixes the regression from many_drivers.py	2022-02-08 01:44:07 -08:00
Jiajun Yao	56c7b74072	Delete nightly shuffle_data_loader (#22185 )	2022-02-07 15:23:34 -08:00
Eric Liang	00b5801d71	Fix datasets leaking worker processes due to closure capture of stats actor handle (#22156 )	2022-02-07 14:05:44 -08:00
Jiajun Yao	355ee4a02c	Fix nightly shuffle_data_loader by pinning down dependencies versions (#22183 )	2022-02-07 11:25:30 -08:00
Chen Shen	13819304d4	[Core][nightly-test] better way of calculating num features (#22158 ) * better filter of column length * address comments * more	2022-02-07 02:13:40 -08:00
Kai Fricke	dd935874ee	[ci/release] Fix job submission command (#22093 ) Ray job submission does not accept quoted commands anymore (#22011). This PR updates the command to fix job submission within e2e tests.	2022-02-04 00:05:52 +01:00
mwtian	b528bf9202	Revert "[e2e] Remove unnecessary logic around copying results (#22034 )" (#22088 ) This reverts commit `92d7e9bf98`.	2022-02-03 13:42:40 -08:00
mwtian	92d7e9bf98	[e2e] Remove unnecessary logic around copying results (#22034 ) After #21905, some of the logic around handling result artifacts became unnecessary or incorrect (in generating error logs). It is removed.	2022-02-03 12:15:06 -08:00
SangBin Cho	3c056a6b92	Revert "[Nightly Test] Add more metadata to test result (#21990 )" (#22052 ) This reverts commit `fd20cf3239`.	2022-02-02 12:56:42 -08:00
SangBin Cho	fd20cf3239	[Nightly Test] Add more metadata to test result (#21990 ) Add columns: error code, commit url, stable, session url, and runtime	2022-01-31 22:33:30 -08:00
Yi Cheng	0659d4a472	[nightly] Limit many drivers iteration to 4000 iterations (#21958 ) Due to faster running of many drivers, we limit the iteration to 4k for the test.	2022-01-31 13:26:02 -08:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975 ) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Yi Cheng	570f67798a	[nightly] Move scheduling tests into one suite (#21959 ) For future convenience, we are moving scheduling-related tests into one suite for easier monitoring and benchmarking.	2022-01-28 13:32:34 -08:00
Chen Shen	bfe3e5f4a8	add check on shape (#21947 )	2022-01-28 12:27:43 -08:00

716 commits