ray/python
Stephanie Wang 162cc9e6bd
Add chaos test for shuffle (#20657)
Adds a working failure test for streaming and non-streaming shuffle, without lineage reconstruction. The change has three parts.

Test improvements:
- modifies AutoscalingCluster to allow passing an idle node timeout (the default is very low)
- some small improvements to the NodeKiller actor to reduce test flakiness.

Shuffle fixes:
- modifies the shuffle tracker to wait on task futures instead of having tasks signal it. During failures, tasks may never signal the tracker, so we can't rely on those signals to track progress.
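
The difference between the two tracking styles can be sketched with the standard library alone. This is an illustration, not the shuffle tracker code from this commit: `map_task`, `track_progress`, and the `signals` list are hypothetical names, and `concurrent.futures` stands in for Ray futures. A task that fails never signals, but its future still resolves (with an exception), so a tracker that waits on futures always accounts for every task.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def map_task(i, signals):
    # Hypothetical shuffle task. Task 2 simulates a failure that happens
    # before the task gets a chance to signal the tracker.
    if i == 2:
        raise RuntimeError("simulated failure")
    signals.append(i)  # signal-based style: the task reports its own completion
    return i


def track_progress(num_tasks):
    signals = []
    finished, failed = 0, 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        pending = {pool.submit(map_task, i, signals) for i in range(num_tasks)}
        # Future-based style: the tracker observes every task's outcome,
        # including tasks that died without signalling.
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                if fut.exception() is not None:
                    failed += 1
                else:
                    finished += 1
    return finished, failed, len(signals)


finished, failed, signalled = track_progress(4)
print(f"futures saw {finished + failed} of 4 tasks; signals saw {signalled}")
```

A tracker that only counted signals would wait forever for the fourth one here, while the future-based loop terminates and also learns which task failed.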

Core fixes:
- raylet now exits immediately if it receives the Shutdown RPC with graceful=False. There was a bug here: the raylet is supposed to exit after replying to the client, but the gRPC server goes down for an unknown reason and the client reply is never sent.
- On reference deletion, the owner now publishes an additional message to subscribers that the object has been deleted. Previously, streaming shuffle could hang because raylets pulling an object subscribed after the object had already been deleted, so they never received the error signal.
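
The subscribe-after-delete race from the last bullet can be shown with a toy publisher. This is a sketch under stated assumptions, not Ray's pubsub implementation: `ToyOwner`, its methods, and the choice to remember deleted object IDs are all hypothetical. It only illustrates why a subscriber that arrives late needs an explicit "deleted" message to avoid waiting forever.

```python
class ToyOwner:
    """Hypothetical object owner that notifies subscribers about deletion."""

    def __init__(self):
        self.subscribers = {}  # object_id -> list of callbacks
        self.deleted = set()   # assumption: the owner remembers deleted IDs

    def subscribe(self, object_id, callback):
        if object_id in self.deleted:
            # Late subscriber: tell it right away instead of leaving it waiting.
            callback(object_id, "deleted")
            return
        self.subscribers.setdefault(object_id, []).append(callback)

    def delete(self, object_id):
        self.deleted.add(object_id)
        # Publish the deletion to everyone already subscribed.
        for callback in self.subscribers.pop(object_id, []):
            callback(object_id, "deleted")


events = []
owner = ToyOwner()
owner.subscribe("obj1", lambda oid, msg: events.append(("early", oid, msg)))
owner.delete("obj1")
owner.subscribe("obj1", lambda oid, msg: events.append(("late", oid, msg)))
print(events)
```

Without the check in `subscribe`, the late callback would be registered for an object that no longer exists and would never fire, which is the shape of the hang described above.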
2021-11-30 15:24:09 -08:00
ray Add chaos test for shuffle (#20657) 2021-11-30 15:24:09 -08:00
requirements [train] update Trainer._is_tune_enabled to work when Tune is not installed (#20767) 2021-11-29 20:08:51 -08:00
asv.conf.json [docs] Move all /latest links to /master (#11897) 2020-11-10 10:53:28 -08:00
build-wheel-macos-arm64.sh [CI] [macOS] avoid installing latest setuptools (#20064) 2021-11-04 21:35:03 -07:00
build-wheel-macos.sh [CI] [macOS] avoid installing latest setuptools (#20064) 2021-11-04 21:35:03 -07:00
build-wheel-manylinux2014.sh [dashboard] Rename "new_dashboard" -> "dashboard" (#18214) 2021-09-15 11:17:15 -05:00
build-wheel-windows.sh [Dashboard] Include the dashboard in Windows wheels (#19575) 2021-10-22 17:57:36 -07:00
MANIFEST.in [Build] Another attempt at building Python 3.9 MacOS wheels (#16347) 2021-06-10 10:20:30 -07:00
README-building-wheels.md [build] Build wheels with manylinux2014 (#11621) 2020-11-03 19:36:32 -08:00
requirements.txt Add smart_open dependency to ray[default] (#20420) 2021-11-18 10:00:30 -06:00
requirements_linters.txt [Lint] Add flake8-bugbear (#19053) 2021-10-03 23:24:11 -07:00
requirements_ml_docker.txt [Deps] Bump tensorflow on Docker image and add Codeowners (#20041) 2021-11-05 00:58:34 -07:00
setup.py Pin redis back to redis >= 3.5.0 (#20661) 2021-11-23 15:51:20 -08:00