Kai Fricke
8804758409
[xgboost] Add XGBoost release tests ( #13456 )
...
* Add XGBoost release tests
* Add more xgboost release tests
* Use failure state manager
* Add release test documentation
* Fix wording
* Automate fault tolerance tests
2021-01-20 18:40:23 +01:00
Simon Mo
c963cbc038
Fix Docker Permission for Serve release test again ( #13543 )
2021-01-19 12:23:30 -08:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. ( #13397 )
2021-01-19 09:51:35 +01:00
SangBin Cho
1179db1fc2
Remove an unnecessary file ( #13499 )
2021-01-15 18:29:12 -08:00
Eric Liang
ee6332dbb0
Bump dev branch to 2.0 to avoid endless version bump toil ( #13497 )
...
* wip
* fix
* fix
2021-01-15 17:41:17 -08:00
SangBin Cho
d09df55b14
Update ID specification doc ( #13356 )
2021-01-15 15:15:51 -08:00
Simon Mo
16e8c4a69f
[Release] Fix Serve release test ( #13303 )
...
The Docker image we were using now runs as the `ray` user, so we have to
call sudo.
2021-01-14 12:23:53 -08:00
SangBin Cho
0428537d0b
[Object Spilling] Long running object spilling test ( #13331 )
...
* done.
* formatting.
2021-01-12 16:53:13 -08:00
Kai Fricke
518427627b
[tune] buffer trainable results ( #13236 )
...
* Working prototype
* Pass buffer length, fix tests
* Don't buffer per default
* Dispatch and process save in one go, added tests
* Fix tests
* Pass adaptive seconds to train_buffered, stop result processing after STOP decision
* Fix tests, add release test
* Update tests
* Added detailed logs for slow operations
* Update python/ray/tune/trial_runner.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Revert tests and go back to old tuning loop
* nit
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-12 18:52:47 +01:00
Simon Mo
c32ad2fef5
[Release] Use ray-ml image for long running test ( #13267 )
2021-01-07 10:31:46 -08:00
Max Fitton
5094734205
Update autoscaler-cluster yaml files for release tests ( #13114 )
2021-01-07 11:44:57 -06:00
Simon Mo
01dcb993c7
[Serve] Rescale Serve's Long Running Test to Cluster Mode ( #13247 )
...
Now that `HeadOnly` is the new default HTTP location, we can
re-enable the long running tests to use local multi-clusters.
(Also fixed the controller's API to be up to date; we should
have caught these, so I will open issues for this.)
2021-01-07 08:57:24 -08:00
Max Fitton
0d61ea9b06
[Release] Add 1.1.0 release test logs ( #13054 )
...
* Add microbenchmark to release logs
* check in many_tasks stress test result
* Add results of placement group stress test for 1.1.0
* Add result for test_dead_actors test and correct the name of test_many_tasks.txt
* Add rllib regression test result
* Add pytorch test results for rllib
* remove extraneous log entries
2021-01-06 11:03:16 -08:00
Max Fitton
d018212db5
[Release] Update Release Process Documentation ( #13123 )
2021-01-04 11:09:43 -08:00
Alex Wu
a79c9fcac3
[release tests] test_many_tasks fix ( #12984 )
2020-12-22 11:05:33 -08:00
Max Fitton
e077bc4206
[Release] Bump master to 1.2.0 for 1.1.0 release ( #12856 )
2020-12-15 09:40:26 -08:00
Simon Mo
3d8c1cbae6
[Serve] Fix Serve Release Tests ( #12777 )
2020-12-11 11:53:47 -08:00
Eric Squires
9f70293700
Remove debug extras from setup.py ( #12751 )
2020-12-10 16:23:11 -06:00
Kai Fricke
df10b84113
[Release] release tests yamls for Tune & GPU ( #12496 )
2020-12-08 10:15:07 -08:00
SangBin Cho
3ee4612696
[Release] Fix cluster.yaml ( #12589 )
...
* Fix cluster.yaml
* Updated to use manylinux2014
2020-12-07 13:52:30 -08:00
Richard Liaw
da42bf29d0
[tune] horovod release test ( #12495 )
2020-12-02 12:04:54 -08:00
Eric Liang
9f322db71d
Add many_ppo long running test ( #12364 )
...
* add new test
* update
* update
2020-11-24 16:00:33 -08:00
Sven Mika
4afaa46028
[RLlib] Increase the scope of RLlib's regression tests. ( #12200 )
2020-11-24 22:18:31 +01:00
Edward Oakes
32d159a2ed
Fix release directory & RELEASE_PROCESS.md ( #12269 )
2020-11-23 14:28:59 -06:00
Simon Mo
5df9f07ff3
[CI] Use Docker image for microbenchmarks ( #12189 )
...
* [CI] Use Docker image for microbenchmarks
* Update cluster.yaml
2020-11-19 17:54:40 -08:00
Edward Oakes
2feba4409c
[serve] Fix long running failure test ( #11805 )
2020-11-09 11:21:03 -06:00
Barak Michener
05c4e3fb2a
[build] Build wheels with manylinux2014 ( #11621 )
...
* necessary changes
* Split bazel install
* manylinux2014
* change references to manylinux2014
* Fix lint
* port alex's docker build changes
* fix config issue
* remove extra manylinux2010 requirement script
* revert SHA overwrite
* wip
* incompatible_linklibs
* fix nits
2020-11-03 19:36:32 -08:00
Barak Michener
4348ecf850
Clean up release tests ( #11420 )
2020-10-22 17:04:41 -07:00