Amog Kamsetty
3a52187da8
[Release/Lightning] Add Ray lightning user test ( #19812 )
* wip
* wip
* add ray lightning test
* fix
* update
* merge and add
* fix
* fix
* rename
* autoscale
* add tblib
* gloo backend
* typo
* upgrade torch
* latest and master
2021-11-01 18:29:48 -07:00
Amog Kamsetty
474e44f7e0
[Release/Horovod] Add user test for Horovod ( #19661 )
* infra
* wip
* add test
* typo
* typo
* update
* rename
* fix
* full path
* formatting
* reorder
* update
* update
* Update release/horovod_tests/workloads/horovod_user_test.py
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
* bump num_workers
* update installs
* try
* add pip_packages
* min_workers
* fix
* bump pg timeout
* Fix symlink
* fix
* fix
* cmake
* fix
* pin filelock
* final
* update
* fix
* Update release/horovod_tests/workloads/horovod_user_test.py
* fix
* fix
* separate compute template
* test latest and master
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-11-01 18:28:07 -07:00
matthewdeng
e1e4a45b8d
[train] add simple Ray Train release tests ( #19817 )
* [train] add simple Ray Train release tests
* simplify tests
* update
* driver requirements
* move to test
* remove connect
* fix
* fix
* fix torch
* gpu
* add assert
* remove assert
* use gloo backend
* fix
* finish
Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-11-01 18:25:19 -07:00
xwjiang2010
1803ca13b6
Adding release logs for 1.8.0. ( #19867 )
2021-11-01 10:26:04 -07:00
architkulkarni
702bffe072
[runtime env] [test] Enable runtime env nightly test with working_dir reconnection ( #19906 )
2021-10-31 10:48:48 -05:00
xwjiang2010
4d293c4cee
Increase horovod_test disk space. ( #19917 )
2021-10-30 14:41:31 -07:00
Lixin Wei
1fe9f3372e
[Nightly Test] Remove duplicate printing code ( #19874 )
## Why are these changes needed?
Remove duplicate printing code
2021-10-29 10:19:19 -07:00
Kai Fricke
fa0158abe5
[tune] Cloud checkpointing release tests ( #19638 )
2021-10-29 12:12:01 +02:00
Kai Fricke
a13f738a10
[ci/release] Fix cloud search query ( #19876 )
2021-10-29 11:30:34 +02:00
Kai Fricke
564d8551ed
[ci/release] only check alert if test succeeded before ( #19857 )
2021-10-28 16:09:10 -07:00
Simon Mo
3e038aebb2
[CI] Allow release tests infra to accept buildkite artifacts ( #19803 )
2021-10-27 13:04:01 -07:00
Yi Cheng
abec07700a
[nightly] Adding more tests related to grpc broadcasting to staging mode ( #19779 )
## Why are these changes needed?
We are concerned that gRPC-based broadcasting might negatively affect pg-related workloads. This test ensures it runs well before merging.
## Related issue number
#19438
2021-10-27 10:46:13 -07:00
Jiao
3f628d4f6b
increase long poll timeout and wrk trial cpu resource ( #19768 )
2021-10-26 21:31:39 -07:00
SangBin Cho
bcd27b708f
[Test] Mark many ppo as unstable ( #19769 )
2021-10-26 21:27:43 -07:00
xwjiang2010
ab15dfd478
[Tune release test] Set 500G disk space for rllib_tests. ( #19730 )
2021-10-26 10:12:03 -07:00
Jiao
aaef82920d
[serve] Add periodic timeouts to long poll client to avoid accumulating concurrent tasks in the controller ( #19728 )
2021-10-26 09:44:00 -05:00
Kai Fricke
98244ad130
[ci/release] Report error to database on alert ( #19743 )
2021-10-26 10:48:02 +01:00
Kai Fricke
96ddf5b9ac
[ci/release] Choose cloud by name or ID ( #19742 )
2021-10-26 10:21:54 +01:00
Amog Kamsetty
6e61ca623d
[CI] Infra for "user" tests ( #19662 )
2021-10-26 08:47:22 +01:00
SangBin Cho
ecd5a622ef
[Tests] Add a memory usage on dask on ray tests ( #19674 )
2021-10-25 14:58:26 -07:00
architkulkarni
414910b7fc
[test] [runtime env] Add release test with Ray Client and local pip files ( #19026 )
2021-10-25 11:49:27 -05:00
xwjiang2010
a632cb439f
[Tune] Remove queue_trials. ( #19472 )
2021-10-22 09:24:54 +01:00
SangBin Cho
9000f41aa6
[Nightly Test] Support memory profiling on Ray + implement memory monitor for nightly tests ( #19539 )
* random fixes
* Done
* done
* update the doc
* doc lint fix
* .
* .
2021-10-21 07:37:05 -07:00
Yi Cheng
7a7b356899
[Nightly test] add test for grpc broadcasting ( #19579 )
2021-10-21 07:01:41 -07:00
Kai Fricke
71564040ec
[ci/release] Unwrap after installing pip packages ( #19552 )
2021-10-20 13:41:16 +01:00
Yi Cheng
01b899dafb
[nightly] Fix broken test due to bad syntax #19536 ( #19536 )
2021-10-19 21:43:46 -07:00
Yi Cheng
7a9cedfc5c
[nightly] Add grpc based broadcasting into nightly test for decision_tree ( #19531 )
* dbg
* up
* check
* up
* up
* put grpc based one into nightly test
* up
2021-10-19 19:59:39 -07:00
Kai Fricke
3e8587644b
[ci/release] wrap all release test pip github installs in quotation marks ( #19521 )
2021-10-19 20:55:02 +01:00
Chen Shen
b38ebd368c
[Dataset][nightly-test] spend less money #19488
Reduce the number of epochs and ensure everything runs in the same datacenter.
2021-10-18 18:53:50 -07:00
gjoliver
e9f66cc394
Reduce success criteria for a few learning tests. ( #19484 )
2021-10-18 15:44:38 -07:00
Jiajun Yao
4d9585773f
[Release] Remove release process doc ( #19312 )
2021-10-18 11:24:03 -07:00
Yi Cheng
f47f69d31e
[nightly] Add decision_tree_autoscaling_20_runs to nightly test
2021-10-18 11:19:40 -07:00
Kai Fricke
ad94eb03c6
[ci/release] wrap pip github installs in quotation marks to prevent comment errors ( #19464 )
2021-10-18 18:55:56 +01:00
Kai Fricke
eee05505b1
[ci/release] Add separate timeout parameter for prepare commands ( #19459 )
2021-10-18 16:29:25 +01:00
Kai Fricke
57fe405120
[ci/release] Bump long running release test timeouts to 6 minutes ( #19458 )
2021-10-18 16:27:53 +01:00
Chen Shen
9dba5e0ead
[dataset][nightly-test] fix pipeline ingest test ( #19437 )
2021-10-18 11:31:24 +01:00
Kai Fricke
6c6639a0d7
[ci/release] hotfix for undefined local variable ( #19460 )
2021-10-18 11:28:33 +01:00
matthewdeng
caa42d753c
[release] pin modin>=0.11.0 due to ray.services being removed ( #19446 )
2021-10-18 11:23:05 +01:00
Kai Fricke
c10d434713
[release] Allow commit hashes instead of URLs, add bisection utility ( #19398 )
2021-10-18 10:44:29 +01:00
Kai Fricke
e17b23fa5b
[ci/release] Add support for RAY_WHEELS url ( #19364 )
2021-10-14 21:40:01 +01:00
Kai Fricke
e07d0953ea
[ci/release] Undo faulty change to many_ppo num_samples ( #19388 )
2021-10-14 10:27:31 -07:00
Antoni Baum
e9df253f5d
[CI/docs] Remove [default] from xgboost-ray ( #19186 )
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-10-14 16:29:55 +01:00
Kai Fricke
9cee83c919
[tune] PBT: Add burn-in period ( #19321 )
2021-10-14 16:28:29 +01:00
Carlo Grisetti
5cee8a1985
[release tests] Switch from yaml.load to yaml.safe_load ( #19365 )
2021-10-13 17:27:25 -07:00
Yi Cheng
1dc03cd49d
[nightly] Put many nodes actor test back ( #19313 )
## Why are these changes needed?
There are two issues fixed in this PR:
- make sure wait for session count alive node
- upgrade the machine to match what's tested in OSS Ray.
## Related issue number
https://github.com/ray-project/ray/issues/19084
2021-10-13 15:51:12 -07:00
matthewdeng
d998373968
[release] fix test by pinning filelock ( #19334 )
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-10-13 22:27:04 +01:00
Jiao
893f76daf9
[serve] Add serve FT nightly test to buildkite ( #19361 )
2021-10-13 13:56:55 -07:00
Jiao
85b8a6de5f
[Serve] Add nightly test for Serve failure recovery ( #19125 )
2021-10-11 18:33:20 -07:00
SangBin Cho
dd1c1f9787
[Nightly test] remove env vars from tests ( #19221 )
Tests should set as few env vars as possible; it is better to run with the default config. This PR removes the unnecessary env vars that were set.
2021-10-08 06:53:23 -07:00
Clark Zinzow
ca731d7c86
[Datasets] Fix API breakage in Datasets nightly test.
2021-10-07 15:07:19 -07:00