hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
mwtian	48599aef9e	Roll forward to run train_small in client mode. (#16610 )	2021-06-23 08:52:08 +01:00
mwtian	f5f23448fc	Support downloading and testing wheels for Python 3.9. (#16586 )	2021-06-21 12:02:22 -07:00
Chen Shen	853caea146	[tests]migrate test-many-tasks/test-dead-actors to nightly tests (#16469 ) * init commit * Update release/nightly_tests/nightly_tests.yaml Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com> * Update release/nightly_tests/nightly_tests.yaml Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com> Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2021-06-18 18:43:25 -07:00
Kai Fricke	aecc4c8d28	[release] fix sgd base image, microbenchmark timeout, revert xgboost train_small to not use connect (#16532 )	2021-06-18 11:40:04 +01:00
SangBin Cho	6dc4032d19	Set the 500GB block device for a single node test (#16493 )	2021-06-16 22:37:30 -07:00
Kai Fricke	9352cb781c	[release tests] Fix microbenchmark base image, network overhead cluster wait time, add long running tests (#16355 )	2021-06-16 21:37:17 +01:00
mwtian	2f7d535253	[Test] Use Ray client in XGBoost train_small release test (#16319 )	2021-06-16 14:39:32 +01:00
Antoni Baum	2fb10e6730	[SGD] Add support for native Torch AMP in SGD (#16382 ) * SGD native AMP initial commit * SGD native amp second pass * Update docs * Update TorchTrainer doc * Temp fix release test * Update release/sgd_tests/sgd_gpu/sgd_gpu_app_config.yaml Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>	2021-06-15 17:48:21 -07:00
Amog Kamsetty	f3ad50fe6a	[SGD] Rename release tests (#16410 ) Test failures unrelated	2021-06-15 17:16:40 +01:00
SangBin Cho	f3ab162c5e	Fix nightly release test issues. (#16419 )	2021-06-15 00:43:08 -07:00
Eric Liang	f93ca2b673	Make it much simpler to turn on event stats (#16401 )	2021-06-14 09:51:24 -07:00
SangBin Cho	eb7344069b	[Test] Improving tests (#16368 ) * Improve testing * Fix tsets.	2021-06-11 18:29:22 -07:00
matthewdeng	9c36ff81fa	[release] add golden notebook tests for dask/xgboost and modin/xgboost (#16231 ) Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2021-06-11 10:03:04 +01:00
Eric Liang	ae0e38b86d	Remove legacy feature flags / features (#16349 )	2021-06-10 09:31:38 -07:00
SangBin Cho	c8a5d7ba85	[TEST] Additional data processing nightly test (#16078 ) * in progress * in progress * almost done * Lint * almost done * All tests are available now * Change the test a little more stressful * Modify paramter to make tests a little more stressful	2021-06-09 22:38:53 -07:00
Clark Zinzow	ca68bf1e93	[Release] Update release test configs for 1.4 release. (#16292 ) * Updated scalability envelope tests for 1.4. * Update data processing release test for 1.4.	2021-06-08 00:15:25 -07:00
mwtian	c2a2a6f7c3	Make it easier to run asan and wheel release tests (#16242 )	2021-06-07 22:54:22 -07:00
SangBin Cho	3572d0837e	[Test] Dask on ray sort nightly (#16213 ) * Make dask on ray sort works * lint * revert unrelated change	2021-06-06 15:58:48 -07:00
SangBin Cho	03c33cf443	add a streaming shuffl etest (#16258 )	2021-06-06 15:58:14 -07:00
Clark Zinzow	227f252c39	[Release] Release 1.4.0 stress tests, scalability envelope, and microbenchmark release logs (#16228 )	2021-06-04 16:36:41 -07:00
Kai Fricke	153a8b8fec	[release] convert tune release tests (#15913 )	2021-06-01 11:19:15 -07:00
Sven Mika	c9d220bcda	[RLlib] Upgrade RLlib regression test scripts to new testing tool - RLlib release logs for 1.4. (#16080 )	2021-06-01 17:39:18 +02:00
Amog Kamsetty	da6f28d777	[Release] Add multi-node, multi-GPU SGD release test (#16046 )	2021-05-31 16:23:04 -07:00
SangBin Cho	9fa3b9f6f3	[Nightly test] Test non streaming shuffle (#16150 )	2021-05-31 15:28:02 -07:00
SangBin Cho	94dc06d852	[Nightly test] improve error detection (#16102 ) * improve error detection * improve gitignore * fix	2021-05-27 00:33:21 -07:00
SangBin Cho	ee1ccb569d	[Test] Nightly shuffle test (#15998 ) * shuffle daily test update. * lint * Improve testing. * Download the real nightly. * Addressed code review. * fix typo * fix issue * fix the broken release test * Updated the test.	2021-05-24 15:33:31 -07:00
mwtian	5462c6e7de	Fix link to release checklist from release process doc. (#15793 )	2021-05-13 13:34:54 -07:00
SangBin Cho	259fcbd5bd	[Pubsub] Generalize the pubsub interface and adapt it for ref counting protocol (#15446 ) * Add mock code first * In the initial progress. * Fix the number error * In progress. * in more pgoress. * in progress. * lint. * Prototype done. * Fix compilation bug. * Now it is working with reference counting. * Remove template. * lint. * Fixed issues. * Fix reference count test. * Reference count test passes now. * Fixed the test array problem * Addressed code review. * lint. * Addressed half of code review. * Fix tests. * Addressed the most critical issue. * Make subscriber thread-safe. * Revert "Make subscriber thread-safe." This reverts commit 9a6a52197cfa8463ab60dfaae9530ad3c0ed8790. * Fixed test failures. The only failure now is the asan failure. * Reset test suites and see if it fixes the issue. * Fix a flaky test * Addressed code review.	2021-05-13 09:29:02 -07:00
Eric Liang	0dfd43c61b	Add nightly release test directory and add shuffle release test (#15671 ) * update * udpate * update * update * update * Adjust script/release test json * remove * update * lint Co-authored-by: Kai Fricke <kai@anyscale.com>	2021-05-08 14:21:55 -07:00
Kai Fricke	8db2e5c23a	[release] Move xgboost tune small + microbenchmark release test to new release automation (#15619 )	2021-05-08 20:38:39 +01:00
Kai Fricke	1d52ab819f	[release] release 1.3.0 results and test updates (#15366 ) Convert a number of release tests and add logs for release 1.3.0	2021-05-04 22:10:04 +01:00
Jenna Kwon	15da948214	Support object spilling mode and data load failure mode in dask_on_ra… (#15601 ) * Support object spilling mode and data load failure mode in dask_on_ray_large_scale_test.py * Remove freq and time decimation Co-authored-by: Jenna Kwon <jkkwon@amazon.com>	2021-05-04 10:57:49 -07:00
Amog Kamsetty	ebc44c3d76	[CI] Upgrade flake8 to 3.9.1 (#15527 ) * formatting * format util * format release * format rllib/agents * format rllib/env * format rllib/execution * format rllib/evaluation * format rllib/examples * format rllib/policy * format rllib utils and tests * format streaming * more formatting * update requirements files * fix rllib type checking * updates * update * fix circular import * Update python/ray/tests/test_runtime_env.py * noqa	2021-05-03 14:23:28 -07:00
SangBin Cho	df9329160e	[Tests] Dask on ray release test (#15256 ) * done. * Linting. * Update readme * Update. * Fix issues.	2021-04-15 10:30:17 -07:00
SangBin Cho	d0e83c43ca	[Release Test] Modify parameter to reduce stress (#15048 ) * Fix. * Fix.	2021-04-14 18:27:20 -07:00
Richard Liaw	59bf3a7b22	ray[cluster] -> ray[default] (#15251 )	2021-04-14 09:37:04 -07:00
Edward Oakes	0f9d1bb223	Serve failure release test fix (#15276 ) This test is currently not tested in CI	2021-04-13 17:49:29 +01:00
Edward Oakes	e4ca337e16	[serve] Change remaining tests to use deployment API (#15167 )	2021-04-08 08:15:38 -05:00
Richard Liaw	e72f6b0377	Fix ray[full] -> ray[cluster] #15112 Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2021-04-05 09:55:00 -07:00
Kai Fricke	b366500938	[tune] fix long running release test WIP (#14866 ) - Use placement groups - Introduce time between checks for failure testing - Use gloo instead of nccl	2021-03-25 11:03:22 +01:00
Amog Kamsetty	233f174984	Update release instructions (#14882 )	2021-03-24 12:41:50 -07:00
SangBin Cho	5f7ce293fe	[Test] Large scale dask on ray test (#14340 ) * Add a test. * Add a test. * d * Modify the release doc. * Addressed code review.	2021-03-23 11:00:35 -07:00
Kai Fricke	7364a7a327	[tune] Move Optuna to ask(fixed_distributions) interface (#14731 ) Adjusting to changes in Optuna 2.6.0. Old interface was marked as deprecated.	2021-03-22 12:25:37 +01:00
Ian Rodney	eb12033612	[Code Cleanup] Switch to use ray.util.get_node_ip_address() (#14741 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-03-18 13:10:57 -07:00
Kai Fricke	4014168928	[tune] Introduce `durable()` wrapper to convert trainables into durable trainables (#14306 ) * [tune] Introduce `durable()` wrapper to convert trainables into durable trainables * Fix wrong check * Improve docs, add FAQ for tackling overhead * Fix bugs in `tune.with_parameters` * Update doc/source/tune/api_docs/trainable.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Update doc/source/tune/_tutorials/_faq.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-26 13:59:28 +01:00
SangBin Cho	5740b2391e	Add multi node data processing cluster.yaml (#14198 )	2021-02-19 16:16:55 -08:00
Kai Fricke	a0f73cf3f7	[xgboost] Update XGBoost release test configs (#13941 ) * Update XGBoost release test configs * Use GPU containers * Fix elastic check * Use spot instances for GPU * Add debugging output * Fix success check, failure checking, outputs, sync behavior * Update release checklist, rename mounts	2021-02-17 23:00:49 +01:00
Alex Wu	4846a6c2d0	Release process update (#13798 )	2021-02-15 11:40:49 -08:00
Kai Fricke	1ef2a6790c	[tune] add scalability release tests (#13986 ) * Add scalability tests * Network overhead cluster * Update xgboost tests * Document release tests * Don't raise on failed trial * Update to multi node yamls * Update yamls * Revert xgboost test changes * Fix import * Update release/tune_tests/scalability_tests/workloads/test_bookkeeping_overhead.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Pass aws credentials (WIP) * Update durable trainable example * Update xgboost sweep * Change xgboost scope, fix durable trainable stop condition * Fix max depth to limit total test length * Add cluster information to test descriptions. Update release checklist/process docs Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-10 17:16:31 +01:00
Kai Fricke	1e113d2e6e	[tune/xgboost] Update release test docs (#13880 ) * Update release test docs * Update	2021-02-04 13:10:56 +01:00

84 commits