Kai Yang
81be461ba2
[Core] Limit starting workers with maximum_startup_concurrency per worker type ( #16214 )
2021-06-09 13:11:53 +08:00
Simon Mo
4c0069edc2
[CI] Fix determine_tests_to_run logic ( #16320 )
...
We recently had two master breakages due to bugs in the
determine_tests_to_run script.
https://github.com/ray-project/ray/pull/16120
https://github.com/ray-project/ray/pull/15981
This PR fixes both of them.
2021-06-08 19:59:03 -07:00
Kathryn Zhou
2394ab2d2e
Update versioning for tracing in Ray docs ( #16041 )
...
Co-authored-by: Kathryn Zhou <kathrynzhou@kathryns-mbp.lan>
2021-06-08 19:23:19 -07:00
Dmitri Gekhtman
41b2e569fb
[autoscaler] Don't rsync cluster state with local node provider ( #16281 )
2021-06-08 12:27:06 -07:00
Eric Liang
deda35fb4a
Batch the AddSpilledURLs RPC ( #16303 )
2021-06-08 12:10:35 -07:00
Alex Wu
ae1cb12221
Revert "[GCS] Bookkeeping normal task resources in GCS ( #16185 )" ( #16315 )
...
This reverts commit f2384a9743.
2021-06-08 11:02:28 -07:00
fyrestone
4ca316a0f4
Move test_snapshot from test_dashboard.py to modules/snapshot/tests/test_snapshot.py ( #16306 )
...
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-06-08 10:26:03 -07:00
Amog Kamsetty
de4045703d
[SGD] Fix SGD Client CI ( #16301 )
2021-06-08 10:08:14 -07:00
Simon Mo
9afb6f1ada
Revert "[CI] macOS Build to buildkite ( #16135 )" ( #16312 )
...
This reverts commit 113556463d.
2021-06-08 09:33:03 -07:00
Sven Mika
4b8dadccbd
[RLlib] Fix PR 16162: Having added sleep to _NextValueNotReady causes TD3 tests to become flaky. ( #16309 )
2021-06-08 07:27:02 -07:00
Chong-Li
f2384a9743
[GCS] Bookkeeping normal task resources in GCS ( #16185 )
2021-06-08 19:58:15 +08:00
Clark Zinzow
ca68bf1e93
[Release] Update release test configs for 1.4 release. ( #16292 )
...
* Updated scalability envelope tests for 1.4.
* Update data processing release test for 1.4.
2021-06-08 00:15:25 -07:00
Lixin Wei
870a0c16a3
[Logging] Change std::exit to std::_Exit ( #16280 )
...
* change abort to exit
* change to std::_Exit
2021-06-08 00:14:17 -07:00
Chris K. W
c8e3ed9eec
[core] Use function_actor_manager.lock when deserializing ( #16278 )
...
* use function_actor_manager.lock when deserializing
* add comment and todo
* better comment
* fix comment
2021-06-08 00:13:42 -07:00
mwtian
c2a2a6f7c3
Make it easier to run asan and wheel release tests ( #16242 )
2021-06-07 22:54:22 -07:00
Simon Mo
113556463d
[CI] macOS Build to buildkite ( #16135 )
2021-06-07 21:33:00 -07:00
fyrestone
dfadf33a94
[Dashboard] Reorganize dashboard modules - node ( #16217 )
2021-06-07 19:50:46 -07:00
Travis Addair
7802ff66d4
[docker] Updated GPU Dockerfiles to CUDA 11.2 ( #16269 )
2021-06-07 16:15:19 -07:00
Alex Wu
6f5064b7ef
Use pytest not unittest ( #16265 )
...
* .
* done
* done
* .
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-07 12:26:56 -07:00
Alex Wu
9f8f108e3f
[deflek] Split test failure into test failure 4 ( #16264 )
...
* .
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-07 11:54:55 -07:00
Lixin Wei
75196cf7f4
[scheduler] Clean up TaskRequest ( #16288 )
2021-06-07 11:38:34 -07:00
Edward Oakes
418dd1e8b9
fix serve start namespace issue and add test ( #16291 )
2021-06-07 11:30:31 -07:00
Siyuan (Ryans) Zhuang
480e5e822e
Initial workflow API implementation ( #16174 )
2021-06-07 10:00:15 -07:00
SangBin Cho
f867c27eda
[Object spilling] Fix race condition that deletes files at the wrong time. ( #16153 )
...
* Error fix.
* remove debug code
* Add unit test
* Fix a test failure
2021-06-07 09:56:55 -07:00
SangBin Cho
3572d0837e
[Test] Dask on ray sort nightly ( #16213 )
...
* Make dask on ray sort work
* lint
* revert unrelated change
2021-06-06 15:58:48 -07:00
SangBin Cho
03c33cf443
add a streaming shuffle test ( #16258 )
2021-06-06 15:58:14 -07:00
Eric Liang
1d8cb2d19e
Add event stats documentation, fix misc race condition ( #16236 )
...
* update
* stats
* update
* fix
2021-06-06 12:44:30 -07:00
architkulkarni
b88163f010
[Core] [runtime env] Fix injection of ray[default] ( #16275 )
2021-06-05 17:32:50 -05:00
architkulkarni
b3a0b97737
Revert "[Core] [runtime env] Inject ray[default] into pip dependencies ( #16268 )" ( #16274 )
...
This reverts commit e5fad4bc2d.
2021-06-05 21:26:19 +03:00
Eric Liang
ca861ee47f
update ( #16270 )
2021-06-05 11:16:01 -07:00
Dmitri Gekhtman
7d1e7a0d4f
[autoscaler] Fix local node provider ( #16202 )
...
* Don't override resources for local node provider.
* Wip
* Local node provider prep logic
* ../python/ray/autoscaler/local/defaults.yaml
* wip
* Fix example-full
* defaults comment
* wip
* head type max workers
* sync-state
* No docker
* Fix
* external head ip option
* wip
* move external_ip out of tags
* Update examples
* Update comment
* Skip local defaults
* Config test
* Test external ip
* Change ray start commands to what they were before
* missing yamls
* Fix test
* Remove scary Docker
* Fixes
* Extra test
* address comments
* fixes pre-single-node-type-attempt
* rewrite comment a bit
* One type
* fix
* get rid of pdb
* no placeholders
* fix
* worker nodes and head node optional during launch
* fix
* fix again
* config comment fixes
* mock -> aws, not local
* Update python/ray/autoscaler/_private/local/config.py
Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
* second pop fixed
* Explanatory comments for config logic
* deprecation comments
* Update python/ray/autoscaler/_private/local/config.py
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
* update test
* fix
* More descriptive name for local provider check
* Remove external-ip from example minimal and add a more detailed doc string.
* Make clearer the equivalence between a ray restart and non-empty ray-start commands
* extra comment
* Update python/ray/autoscaler/_private/local/node_provider.py
* Update python/ray/autoscaler/_private/commands.py
* Update python/ray/autoscaler/_private/commands.py
* Update python/ray/autoscaler/_private/util.py
* lint
* Update python/ray/autoscaler/_private/local/node_provider.py
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
2021-06-05 19:29:19 +03:00
Dmitri Gekhtman
e58ba66681
[gcp][doc][minor] project_id is required ( #16266 )
2021-06-05 01:00:11 -07:00
Chris K. W
2e11ac678f
[autoscaler] Additional Autoscaler Metrics ( #16198 )
2021-06-04 23:19:17 -07:00
architkulkarni
e5fad4bc2d
[Core] [runtime env] Inject ray[default] into pip dependencies ( #16268 )
...
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2021-06-05 00:22:33 -05:00
architkulkarni
6c99267972
[Core] [runtime_env] add get_release_wheel_url() ( #16267 )
2021-06-04 22:00:17 -05:00
Stephanie Wang
dd73e8d31b
[core] Add object store debug information ( #16232 )
...
* debug
* todo
* periodic dump
* Build and debug
* x
* debug
* more debug
2021-06-04 19:42:00 -07:00
yncxcw
e13509075d
[Core] Make the exit type explicit for workers being killed in TryKillingIdleWorkers ( #16211 )
2021-06-04 18:23:36 -07:00
Lixin Wei
59a2879216
[New Scheduler] Remove Useless Fields in Cluster Resource Data ( #16254 )
...
* non-tests done
* test modified
2021-06-04 18:00:13 -07:00
Clark Zinzow
227f252c39
[Release] Release 1.4.0 stress tests, scalability envelope, and microbenchmark release logs ( #16228 )
2021-06-04 16:36:41 -07:00
Dmitri Gekhtman
a60ee3a8b2
[autoscaler][kubernetes][minor] latest images everywhere ( #16205 )
...
* latest images everywhere
* add back some documentation on the images
* Doc update
2021-06-04 16:01:39 -07:00
Eric Liang
527d51b83a
Allow configuring internal config with RAY_{name} env vars.
2021-06-04 15:37:31 -07:00
Eric Liang
472fe46a75
skip on win ( #16256 )
2021-06-04 15:36:55 -07:00
Lixin Wei
cf58cd76c7
[Logging] Disable Core Dumps in Fatal Logging ( #16189 )
2021-06-04 11:44:08 -07:00
Ian Rodney
ba14b1c538
skip test_delay_in_rewriting_environment ( #16255 )
2021-06-04 09:50:02 -07:00
Gerges Dib
f8cf4a1985
[RLlib] Fixed import tensorflow when module not available ( #16171 )
2021-06-04 10:07:59 +02:00
Simon Mo
d6b3050632
[Buildkite] Wheels and Docker fixup ( #16241 )
2021-06-04 00:48:12 -07:00
Ian Rodney
799af7d7c0
[client] Better Error Messages ( #16163 )
2021-06-04 00:32:21 -07:00
SongGuyang
331ea6b72d
[C++ worker] fix doc ( #16239 )
2021-06-04 13:41:47 +08:00
Yi Cheng
dea178caac
[core] Convert the log from exception to warning for setup function ( #16225 )
2021-06-03 23:53:29 -05:00
architkulkarni
6be5ec8f39
[Core] [runtime env] Fix test_get_master_wheel_url ( #16234 )
2021-06-03 23:09:43 -05:00