Commit graph

8778 commits

Author SHA1 Message Date
Scott Graham
3334357c58
[autoscaler] [azure] Fix Azure Autoscaling Failures (#16640)
Co-authored-by: Scott Graham <scgraham@microsoft.com>
2021-07-10 11:55:00 -07:00
SangBin Cho
33e319e9d7
[Tests] Remove app level error from nightly tests (#16968)
* Completed

* Fix tests

* increase the node wait timeout

Signed-off-by: SangBin Cho <rkooo567@gmail.com>
2021-07-09 12:20:42 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module (#16698)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS default to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
Nikita Vemuri
6d36d7ed7e
[Serve] Call FastAPIWrapper class constructor before startup hooks (#16941)
* run constructor before startup hooks

* address comments

Co-authored-by: Nikita Vemuri <nikitavemuri@nikitas-mbp.attlocal.net>
2021-07-09 09:39:32 -07:00
Dmitri Gekhtman
27a9ae5e13
[autoscaler][gcp] Retry GCP BrokenPipeError (#16952) 2021-07-08 13:54:29 -07:00
Maxim Egorushkin
9cb5c9a422
Never convert trial_id to float when loading progress.csv. (#16959)
* Never convert trial_id to float when loading progress.csv.

* Formatting updated.

Co-authored-by: Maxim Egorushkin <maxim.egorushkin@gmail.com>
2021-07-08 11:06:11 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. (#16774) 2021-07-08 17:31:34 +02:00
Sven Mika
9f6a92163b
[RLlib] Remove old UsageTrackingDict code. (#16867) 2021-07-08 17:27:52 +02:00
SongGuyang
560fd15568
[C++ worker] support build and add C++ worker to python wheel (#16496) 2021-07-08 14:42:26 +08:00
matthewdeng
264e2df7e2
[release] update modin_xgboost_test to use anyscale connect (#16942) 2021-07-07 22:37:41 -07:00
Clark Zinzow
cc215353e2
[Datasets] Adds Dataset.iter_batches(). (#16853) 2021-07-07 22:01:20 -07:00
Frank Luan
7c0320175c
Actor fix (#16955) 2021-07-07 20:51:36 -07:00
Clark Zinzow
9358dd4bc2
[Datasets] Port JSON and CSV readers to datasource API. (#16938)
* Port JSON and CSV readers to datasource API.

* Formatting.

* Moved datasources to datasource dir, created shared FileBasedDatasource.

* Confirm that accessing dataset schema raises an error.

* Formatting.

* Return None for unknown metadata instead of raising an error.

* Feedback.
2021-07-07 20:32:04 -07:00
Chen Shen
ae6e5db927
Fix minor message in plasma (#16953) 2021-07-07 20:30:59 -07:00
Kai Yang
e925051ce4
[Core] Get node to connect for driver in global state accessor (#16810) 2021-07-08 11:21:12 +08:00
Amog Kamsetty
3c482cd6c8
Skip more test_deploy tests on OSX (#16943)
* skip more

* skip more
2021-07-07 16:53:21 -07:00
Simon Mo
f4671d55d8
Bump log monitor's sleep duration to 0.1s (#16939)
We observed that in long-running serving scenarios the log monitor
consistently uses 10% of CPU even when there are no new lines. The
longer sleep duration should reduce that usage.
2021-07-07 15:41:34 -07:00
Chen Shen
0421fa188e
[core] use fallocate for fallback allocation to avoid SIGBUS (#16824) 2021-07-07 14:50:11 -07:00
Dmitri Gekhtman
2f42b0c4b9
[kubernetes] K8s keep gpu zero override (#16887) 2021-07-07 13:45:34 -07:00
Chen Shen
dbd3260141
[core] Deprecate QuotaAwareEvictionPolicy (#16911) 2021-07-07 13:44:41 -07:00
Eric Liang
3b9f6ccc5e
Remove autoinit from ray.data (#16925) 2021-07-07 13:44:10 -07:00
Thomas J. Fan
ead03c5d29
[doc] Allows xgboost_ray documentation to be rendered (#16919) 2021-07-07 10:45:33 -07:00
Amog Kamsetty
b79ef3ba0f
[Serve] Skip more test_deploy tests on OSX (#16937) 2021-07-07 10:44:01 -07:00
Antoni Baum
8f41a34079
[tune] Placement group manager fixes (#16844)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-07-07 10:42:19 -07:00
Antoni Baum
b737b2a877
[joblib] Improved object store management for Pool (#16879)
* Improved object store management for Pool

* Update docs, hints

* Add test

* Nit

* Nit
2021-07-07 10:39:18 -07:00
Amog Kamsetty
5c589debfa
Revert "Set runs_per_test_detect_flakes for core tests on master (#16863)" (#16936)
This reverts commit 44042519af.
2021-07-07 10:25:46 -07:00
Dmitri Gekhtman
c6497c6520
[client][test] Client multiprocessing tests + client api minor fix (#16904) 2021-07-07 09:47:27 -07:00
Tao Wang
34422ef53f
[Doc] Add statement for supporting remote Redis (#16869)
* [Doc] Add statement for supporting remote Redis

* Update doc/source/cluster/cloud.rst

Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2021-07-07 00:37:06 -07:00
Eric Liang
639a29437b
Add debug message to buffer check fail (#16924) 2021-07-06 23:37:51 -07:00
mwtian
18d126192e
Deprecate StringIdMap::Remove() (#16888) 2021-07-06 22:46:25 -07:00
SangBin Cho
33a2213c6f
Add another large scale shuffle test to verify stability (#16902) 2021-07-06 22:24:00 -07:00
Eric Liang
03f99100ea
Enable ray auto init by default (#16861) 2021-07-06 21:56:32 -07:00
Eric Liang
d956ca1b54
Add decision tree test to nightly builds (#16912) 2021-07-06 20:49:04 -07:00
Eric Liang
ca083e16d4
[dataset] Fix conversion to pyarrow tables in several transforms (#16916) 2021-07-06 20:40:57 -07:00
matthewdeng
23088bd7ea
[release] update torch_tune_serve_test to use anyscale connect (#16754)
* [release] update torch_tune_serve_test to use anyscale connect

* use download_results to download model checkpoint

* clean up code to support both OSS and Anyscale
2021-07-06 19:02:50 -07:00
Amog Kamsetty
7318a212fb
[Serve] Skip test_redeploy_multiple_replicas on OSX (#16915) 2021-07-06 18:58:36 -07:00
Eric Liang
5a94632f8d
Revert the gRPC-based resource broadcast due to check failures during cluster autoscaling (#16910) 2021-07-06 18:47:02 -07:00
Eric Liang
44042519af
Set runs_per_test_detect_flakes for core tests on master (#16863) 2021-07-06 18:46:48 -07:00
SangBin Cho
74c9fa0650
Update unclear named actor doc & namespace (#16903)
* Update the named actor documentation

* Update the doc

* Update doc/source/actors.rst

* Move note to the end of the block not the beginning

Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2021-07-06 15:51:29 -07:00
Eric Liang
7e52fde8a3
Fix num returns error message (#16865) 2021-07-06 14:57:26 -07:00
Richard Liaw
38496b1765
Allow feature requests in GitHub Issues (#16892)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-06 14:12:23 -07:00
Amog Kamsetty
503b748d64
Revert "Revert "[Java] upgrade jar deps to fix cves" (#16889)" (#16899)
This reverts commit f2308a0cdf.
2021-07-06 14:00:50 -07:00
Amog Kamsetty
39d60f62d2
[hotfix] fix material-ui version once more (#16901) 2021-07-06 13:57:34 -07:00
Kai Fricke
10fd7111b3
[rllib] Improve test learning check, fix flaky two step qmix (#16843) 2021-07-06 19:39:12 +01:00
Simon Mo
b11b35aa45
hotfix material-ui version again (#16897) 2021-07-06 11:08:57 -07:00
Amog Kamsetty
d5ac5c45ea
[Dashboard] Pin material-ui/lab dependency (#16890) 2021-07-06 10:49:10 -07:00
Amog Kamsetty
f2308a0cdf
Revert "[Java] upgrade jar deps to fix cves" (#16889)
This reverts commit 25666fff81.
2021-07-06 10:33:31 -07:00
Stefan Schneider
d4babd69c1
[windows] correct symlinks for files (node.py) (#16817) 2021-07-06 10:01:13 -07:00
Amog Kamsetty
ecb632140f
Revert "RockPaperScissors Pettingzoo" (#16886)
This reverts commit bf3e3225b6.
2021-07-06 09:43:47 -07:00
Dmitri Gekhtman
a27a8172cc
[autoscaler] Handle node type key change/deletion (#16691) 2021-07-06 09:06:58 -07:00