Amog Kamsetty
c0560dadef
[Docker] Pin Tensorflow ( #16741 )
2021-06-29 11:14:46 -07:00
Dmitri Gekhtman
257d072d13
[kubernetes][release] K8s release test instructions ( #16662 )
2021-06-29 10:57:35 -07:00
matthewdeng
b0f304a1b5
[release] add golden notebook release test for torch/tune/serve ( #16619 )
...
* [release] add golden notebook release test for torch/tune/serve
* start serve on all nodes so remote localhost works
2021-06-29 09:13:23 -07:00
Jiao
6aeda62d40
[Serve] Add serve test config files and wrk dependency ( #16631 )
2021-06-28 10:01:55 -07:00
Chen Shen
c4d7b31a79
[Test] Placement group stress test ( #16633 )
2021-06-24 21:35:55 -07:00
Amog Kamsetty
53d16365b0
[Release] Convert Horovod and SGD release tests ( #15999 )
2021-06-24 15:56:02 +01:00
Kai Fricke
ef97bdd407
[release] Fix app config: Install latest releases. Bump xgboost-ray version ( #16581 )
2021-06-24 12:56:21 +01:00
mwtian
48599aef9e
Roll forward to run train_small in client mode. ( #16610 )
2021-06-23 08:52:08 +01:00
mwtian
f5f23448fc
Support downloading and testing wheels for Python 3.9. ( #16586 )
2021-06-21 12:02:22 -07:00
Chen Shen
853caea146
[tests] migrate test-many-tasks/test-dead-actors to nightly tests ( #16469 )
...
* init commit
* Update release/nightly_tests/nightly_tests.yaml
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
* Update release/nightly_tests/nightly_tests.yaml
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-06-18 18:43:25 -07:00
Kai Fricke
aecc4c8d28
[release] fix sgd base image, microbenchmark timeout, revert xgboost train_small to not use connect ( #16532 )
2021-06-18 11:40:04 +01:00
SangBin Cho
6dc4032d19
Set the 500GB block device for a single node test ( #16493 )
2021-06-16 22:37:30 -07:00
Kai Fricke
9352cb781c
[release tests] Fix microbenchmark base image, network overhead cluster wait time, add long running tests ( #16355 )
2021-06-16 21:37:17 +01:00
mwtian
2f7d535253
[Test] Use Ray client in XGBoost train_small release test ( #16319 )
2021-06-16 14:39:32 +01:00
Antoni Baum
2fb10e6730
[SGD] Add support for native Torch AMP in SGD ( #16382 )
...
* SGD native AMP initial commit
* SGD native amp second pass
* Update docs
* Update TorchTrainer doc
* Temp fix release test
* Update release/sgd_tests/sgd_gpu/sgd_gpu_app_config.yaml
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2021-06-15 17:48:21 -07:00
Amog Kamsetty
f3ad50fe6a
[SGD] Rename release tests ( #16410 )
...
Test failures unrelated
2021-06-15 17:16:40 +01:00
SangBin Cho
f3ab162c5e
Fix nightly release test issues. ( #16419 )
2021-06-15 00:43:08 -07:00
Eric Liang
f93ca2b673
Make it much simpler to turn on event stats ( #16401 )
2021-06-14 09:51:24 -07:00
SangBin Cho
eb7344069b
[Test] Improving tests ( #16368 )
...
* Improve testing
* Fix tests.
2021-06-11 18:29:22 -07:00
matthewdeng
9c36ff81fa
[release] add golden notebook tests for dask/xgboost and modin/xgboost ( #16231 )
...
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-06-11 10:03:04 +01:00
Eric Liang
ae0e38b86d
Remove legacy feature flags / features ( #16349 )
2021-06-10 09:31:38 -07:00
SangBin Cho
c8a5d7ba85
[TEST] Additional data processing nightly test ( #16078 )
...
* in progress
* in progress
* almost done
* Lint
* almost done
* All tests are available now
* Make the test a little more stressful
* Modify parameter to make tests a little more stressful
2021-06-09 22:38:53 -07:00
Clark Zinzow
ca68bf1e93
[Release] Update release test configs for 1.4 release. ( #16292 )
...
* Updated scalability envelope tests for 1.4.
* Update data processing release test for 1.4.
2021-06-08 00:15:25 -07:00
mwtian
c2a2a6f7c3
Make it easier to run asan and wheel release tests ( #16242 )
2021-06-07 22:54:22 -07:00
SangBin Cho
3572d0837e
[Test] Dask on ray sort nightly ( #16213 )
...
* Make dask on ray sort work
* lint
* revert unrelated change
2021-06-06 15:58:48 -07:00
SangBin Cho
03c33cf443
add a streaming shuffle test ( #16258 )
2021-06-06 15:58:14 -07:00
Clark Zinzow
227f252c39
[Release] Release 1.4.0 stress tests, scalability envelope, and microbenchmark release logs ( #16228 )
2021-06-04 16:36:41 -07:00
Kai Fricke
153a8b8fec
[release] convert tune release tests ( #15913 )
2021-06-01 11:19:15 -07:00
Sven Mika
c9d220bcda
[RLlib] Upgrade RLlib regression test scripts to new testing tool - RLlib release logs for 1.4. ( #16080 )
2021-06-01 17:39:18 +02:00
Amog Kamsetty
da6f28d777
[Release] Add multi-node, multi-GPU SGD release test ( #16046 )
2021-05-31 16:23:04 -07:00
SangBin Cho
9fa3b9f6f3
[Nightly test] Test non streaming shuffle ( #16150 )
2021-05-31 15:28:02 -07:00
SangBin Cho
94dc06d852
[Nightly test] improve error detection ( #16102 )
...
* improve error detection
* improve gitignore
* fix
2021-05-27 00:33:21 -07:00
SangBin Cho
ee1ccb569d
[Test] Nightly shuffle test ( #15998 )
...
* shuffle daily test update.
* lint
* Improve testing.
* Download the real nightly.
* Addressed code review.
* fix typo
* fix issue
* fix the broken release test
* Updated the test.
2021-05-24 15:33:31 -07:00
mwtian
5462c6e7de
Fix link to release checklist from release process doc. ( #15793 )
2021-05-13 13:34:54 -07:00
SangBin Cho
259fcbd5bd
[Pubsub] Generalize the pubsub interface and adapt it for ref counting protocol ( #15446 )
...
* Add mock code first
* In the initial progress.
* Fix the number error
* In progress.
* in more progress.
* in progress.
* lint.
* Prototype done.
* Fix compilation bug.
* Now it is working with reference counting.
* Remove template.
* lint.
* Fixed issues.
* Fix reference count test.
* Reference count test passes now.
* Fixed the test array problem
* Addressed code review.
* lint.
* Addressed half of code review.
* Fix tests.
* Addressed the most critical issue.
* Make subscriber thread-safe.
* Revert "Make subscriber thread-safe."
This reverts commit 9a6a52197cfa8463ab60dfaae9530ad3c0ed8790.
* Fixed test failures. The only failure now is the asan failure.
* Reset test suites and see if it fixes the issue.
* Fix a flaky test
* Addressed code review.
2021-05-13 09:29:02 -07:00
Eric Liang
0dfd43c61b
Add nightly release test directory and add shuffle release test ( #15671 )
...
* update
* update
* update
* update
* update
* Adjust script/release test json
* remove
* update
* lint
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-05-08 14:21:55 -07:00
Kai Fricke
8db2e5c23a
[release] Move xgboost tune small + microbenchmark release test to new release automation ( #15619 )
2021-05-08 20:38:39 +01:00
Kai Fricke
1d52ab819f
[release] release 1.3.0 results and test updates ( #15366 )
...
Convert a number of release tests and add logs for release 1.3.0
2021-05-04 22:10:04 +01:00
Jenna Kwon
15da948214
Support object spilling mode and data load failure mode in dask_on_ra… ( #15601 )
...
* Support object spilling mode and data load failure mode in dask_on_ray_large_scale_test.py
* Remove freq and time decimation
Co-authored-by: Jenna Kwon <jkkwon@amazon.com>
2021-05-04 10:57:49 -07:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 ( #15527 )
...
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
SangBin Cho
df9329160e
[Tests] Dask on ray release test ( #15256 )
...
* done.
* Linting.
* Update readme
* Update.
* Fix issues.
2021-04-15 10:30:17 -07:00
SangBin Cho
d0e83c43ca
[Release Test] Modify parameter to reduce stress ( #15048 )
...
* Fix.
* Fix.
2021-04-14 18:27:20 -07:00
Richard Liaw
59bf3a7b22
ray[cluster] -> ray[default] ( #15251 )
2021-04-14 09:37:04 -07:00
Edward Oakes
0f9d1bb223
Serve failure release test fix ( #15276 )
...
This test is currently not tested in CI
2021-04-13 17:49:29 +01:00
Edward Oakes
e4ca337e16
[serve] Change remaining tests to use deployment API ( #15167 )
2021-04-08 08:15:38 -05:00
Richard Liaw
e72f6b0377
Fix ray[full] -> ray[cluster] #15112
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-04-05 09:55:00 -07:00
Kai Fricke
b366500938
[tune] fix long running release test WIP ( #14866 )
...
- Use placement groups
- Introduce time between checks for failure testing
- Use gloo instead of nccl
2021-03-25 11:03:22 +01:00
Amog Kamsetty
233f174984
Update release instructions ( #14882 )
2021-03-24 12:41:50 -07:00
SangBin Cho
5f7ce293fe
[Test] Large scale dask on ray test ( #14340 )
...
* Add a test.
* Add a test.
* d
* Modify the release doc.
* Addressed code review.
2021-03-23 11:00:35 -07:00
Kai Fricke
7364a7a327
[tune] Move Optuna to ask(fixed_distributions) interface ( #14731 )
...
Adjusting to changes in Optuna 2.6.0. Old interface was marked as deprecated.
2021-03-22 12:25:37 +01:00