Commit graph

229 commits

Author SHA1 Message Date
Jiajun Yao
b8ef4f0a34
[CI] Add a retry helper to e2e.py (#19045) 2021-10-02 09:54:41 -07:00
Dmitri Gekhtman
bfd706aea3
[test][k8s] Restore kubernetes test directory, adds some info (#18982) 2021-10-01 11:23:22 +01:00
SangBin Cho
55227a15b9
Handle retry to avoid statement timeout exception (#18968) 2021-09-29 23:04:35 -07:00
Yi Cheng
a993f3a262
[nightly] update nightly test for many node test 2021-09-29 17:28:44 -07:00
Dmitri Gekhtman
944309c017
Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)" (#18954)
* Revert "[nightly] Deflaky nightly test many_nodes_actor_test (#18582)"

This reverts commit fc6a739e4b.

* move to large test

Co-authored-by: Yi Cheng <chengyidna@gmail.com>
2021-09-29 11:02:14 -04:00
Jiajun Yao
35774fd399
[CI] Print out the mismatched commit in ci (#18956) 2021-09-29 15:48:57 +01:00
Chen Shen
62a73f4ce8
[nightly test][event] enable event logs in nightly tests (#18936) 2021-09-28 01:29:26 -07:00
Jiajun Yao
18bdde1918
Install the test wheel last (#18881) 2021-09-24 20:56:53 +01:00
Guyang Song
337005d5a5
[C++ API][hotfix] fix C++ worker dynamic library loading issue on macOS (#18877)
* fix C++ worker on macOS

* fix
2021-09-24 23:39:00 +08:00
Kai Fricke
e08d4253cf
[ci/release] Start cluster before connecting via anyscale connect (#18878) 2021-09-24 16:17:06 +01:00
Kai Fricke
d52203ee03
[ci/release] Fix long running serve test result fetching (#18880) 2021-09-24 16:16:01 +01:00
Chen Shen
7c99aae033
[dataset][nightly-test] add pipelined ingestion/training nightly test 2021-09-23 20:39:03 -07:00
Jiajun Yao
cc84f18176
Increase disk for long running distributed tests (#18855) 2021-09-23 17:52:35 +01:00
Guyang Song
237a2ade76
[wheel][cpp] recover cpp extra (#18597) 2021-09-23 12:10:03 +08:00
Sven Mika
5611150b1a
Increase rllib stress tests timeout for smoke test (#18810) 2021-09-22 14:30:42 +01:00
Kai Fricke
2cbf326410
[ci/release] store buildkite artifacts on buildkite (#18712) 2021-09-22 11:35:59 +01:00
Yi Cheng
fc6a739e4b
[nightly] Deflaky nightly test many_nodes_actor_test (#18582) 2021-09-20 22:43:48 -07:00
gjoliver
5b6d69d61a
Minor change to switch result checking order so there is no artificial delay. (#18555)
Co-authored-by: Jun Gong <jungong@mbpro.local>
2021-09-20 22:49:17 +01:00
Sven Mika
e6aae61487
[RLlib; testing] Fix bug in stress tests not handling >1 trials per experiment (due to grid-search in IMPALA stress tests). (#18705) 2021-09-20 15:31:57 +02:00
xwjiang2010
09e760a1fd
[Release] Change all cpus_per_actor in xgboost test. (#18717) 2021-09-17 12:57:21 -07:00
xwjiang2010
2c92f737f9
Fix dask_xgboost_test (#18713) 2021-09-17 11:25:54 -07:00
Jiao
ca3be60291
[Release] change headnode type for serve benchmark (#18672)
Co-authored-by: Jiao Dong <jiaodong@anyscale.com>
2021-09-16 10:57:36 -07:00
Sven Mika
ba1c489b79
[RLlib Testing] Lower --smoke-test "time_total_s" to make sure it doesn't time out. (#18670) 2021-09-16 18:22:23 +02:00
gjoliver
df32ed35fd
Extend --smoke-test deadlines for learning and stress regression tests. (#18667) 2021-09-16 09:18:39 +01:00
Antoni Baum
7e95f330d5
[ci] Fix xgboost_ray install from git (#18640) 2021-09-15 18:07:15 +01:00
Antoni Baum
eeb67a42cc
pip install xgboost_ray -> xgboost_ray[default] (#18607)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-09-15 14:45:56 +01:00
Kai Fricke
15a83d104d
[ci/release] remove legacy release tests (#18592) 2021-09-15 14:42:58 +01:00
SangBin Cho
b8c361d3fb
[Test] Mark app config failure as an infra failure (#18614) 2021-09-14 17:20:05 -07:00
Kai Fricke
c8188ea70e
[ci/rllib] wait for stress test cluster (#18603) 2021-09-14 19:01:22 +01:00
Kai Fricke
6777e24293
[ci] Add release test owner overview file (#18590) 2021-09-14 11:00:31 -07:00
Sven Mika
08c09737fa
[RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550) 2021-09-14 19:58:10 +02:00
SangBin Cho
51d94ebee0
[Tests] Make nightly test work + Remove work stealing logs (#18300)
* make tests work

* .
2021-09-14 09:52:58 -07:00
Antoni Baum
65d5deae60
[tests] Increase golden notebook test timeout to 20 mins (#18554) 2021-09-14 16:27:56 +01:00
Jiao
d3734d803d
[serve] Change nightly test docker image and enable micro benchmark (#18566) 2021-09-14 09:41:21 -05:00
Kai Fricke
e4754f1e19
[ci] wheel URLs - give some time for wheels to be built (#18505) 2021-09-14 09:56:34 +01:00
Guyang Song
beff857cc1
[release][C++ API] support sanity check C++ (#18545) 2021-09-14 11:39:08 +08:00
gjoliver
2924afa41e
[Release] Create soft links for libcusolver.so.10 as a temporary fix. (#18562)
Co-authored-by: Jun Gong <jungong@anyscale.com>
2021-09-13 14:37:12 -07:00
Jiajun Yao
ec6f5ae9ab
Upgrade serve_tests and runtime_env_tests base image to 1.6.0 (#18563) 2021-09-13 12:47:06 -07:00
Kai Fricke
b543c0e923
[ci] Do not use anyscale connect for xgboost_tests/train_small (#18569) 2021-09-13 20:38:00 +01:00
Kai Fricke
b6392aa6ea
[ci] upgrade microbenchmark base image to 1.6.0 (#18542) 2021-09-13 17:13:01 +01:00
Kai Fricke
7d1e6d3129
[ci/release] Add sanity check for ray wheels hash to release tests (#18489) 2021-09-10 17:50:31 +01:00
Kai Fricke
be438fb600
[release] Also download Ray CPP wheels (#18383) 2021-09-10 17:49:37 +01:00
SangBin Cho
7b2ed4c1f8
[Placement group] Placement group scheduling hangs due to creation/removal race condition (#18419) 2021-09-09 20:39:01 -07:00
matthewdeng
e66f154b14
[release] increase torch_tune_serve timeout to 20 min (#18481) 2021-09-09 16:31:14 -07:00
Simon Mo
6d24214085
[Release] Make sure to uninstall ray for rllib_tests (#18448) 2021-09-08 23:29:40 +01:00
gjoliver
50cdf551ce
[RLlib] Fix test name typo. (#18423)
Co-authored-by: Jun Gong <jungong@mbpro.local>
2021-09-08 23:30:37 +02:00
Yi Cheng
6011d4197f
Open [nightly] Add many_nodes_actor_test to nightly test (#18406) 2021-09-08 11:15:48 -07:00
Sven Mika
cabaa3b3c6
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381) 2021-09-07 11:48:41 +02:00
Sven Mika
5292b70fc6
[RLlib] Add multi-GPU attention net tests to nightly test suite (+ R2D2 tests for LSTM and attention nets). (#18368) 2021-09-06 17:48:05 +02:00
Kai Fricke
d9552e6795
Update release process doc and checklist (#18336)
Co-authored-by: Qing Wang <kingchin1218@126.com>
2021-09-06 14:09:31 +01:00