ray/python at b6c42f96beab3ee00fe4b246e5e9d0479ad379ca - hiro/ray


mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00


Eric Liang b6c42f96be	2017-12-31 14:39:57 -08:00

Auto-scale ray clusters based on GCS load metrics (#1348)

This adds (experimental) auto-scaling support for Ray clusters based on GCS load metrics. The auto-scaling algorithm is as follows:

* Based on current (instantaneous) load information, we compute the approximate number of "used workers". This is based on the bottleneck resource, e.g. if 8/8 GPUs are used in an 8-node cluster but all the CPUs are idle, the number of used nodes is still counted as 8. This number can also be fractional.
* We scale that number by 1 / target_utilization_fraction and round up to determine the target cluster size (subject to the max_workers constraint).
* The autoscaler control loop takes care of launching new nodes until the target cluster size is met.
* When a node is idle for more than idle_timeout_minutes, we remove it from the cluster if that would not drop the cluster size below min_workers.

Note that we'll need to update the wheel in the example yaml file after this PR is merged.
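The scaling rule described in the commit message can be illustrated with a minimal sketch. This is not the code from the PR: the function names (`approx_used_nodes`, `target_cluster_size`, `nodes_to_terminate`) and the dictionary-based inputs are hypothetical, and only `target_utilization_fraction`, `min_workers`, `max_workers`, and `idle_timeout_minutes` come from the message itself. Terminating the longest-idle nodes first is likewise an assumption made for the example.

```python
import math


def approx_used_nodes(num_nodes, used, total):
    """Estimate "used" nodes from the bottleneck (most utilized) resource.

    `used` and `total` map resource names to cluster-wide amounts.
    The result may be fractional. (Hypothetical helper, not Ray's API.)
    """
    bottleneck = 0.0
    for name, capacity in total.items():
        if capacity > 0:
            bottleneck = max(bottleneck, used.get(name, 0) / capacity)
    return num_nodes * bottleneck


def target_cluster_size(used_nodes, target_utilization_fraction,
                        min_workers, max_workers):
    """Scale by 1 / target_utilization_fraction, round up, then clamp."""
    ideal = math.ceil(used_nodes / target_utilization_fraction)
    return max(min_workers, min(max_workers, ideal))


def nodes_to_terminate(idle_minutes_by_node, idle_timeout_minutes, min_workers):
    """Pick idle nodes to remove without dropping below min_workers.

    Assumption for this sketch: the longest-idle nodes are removed first.
    """
    remaining = len(idle_minutes_by_node)
    doomed = []
    by_idle_desc = sorted(idle_minutes_by_node.items(), key=lambda kv: -kv[1])
    for node_id, idle_minutes in by_idle_desc:
        if idle_minutes > idle_timeout_minutes and remaining - 1 >= min_workers:
            doomed.append(node_id)
            remaining -= 1
    return doomed


# The example from the commit message: 8/8 GPUs busy, all CPUs idle,
# 8 nodes. The GPU is the bottleneck, so all 8 nodes count as used.
used = approx_used_nodes(8, {"GPU": 8, "CPU": 0}, {"GPU": 8, "CPU": 64})
size = target_cluster_size(used, target_utilization_fraction=0.5,
                           min_workers=0, max_workers=20)
```

With a target utilization of 0.5, the 8 used nodes translate into a target of 16 nodes, which `max_workers` would cap if it were lower.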
..
ray	Auto-scale ray clusters based on GCS load metrics (#1348)	2017-12-31 14:39:57 -08:00
build-wheel-macos.sh	Make travis runs less verbose. (#1145)	2017-10-19 22:25:56 -07:00
build-wheel-manylinux1.sh	Make travis runs less verbose. (#1145)	2017-10-19 22:25:56 -07:00
README-building-wheels.md	Add script for building MacOS wheels. (#601)	2017-06-01 00:30:46 +00:00
setup.py	EC2 cluster setup scripts and initial version of auto-scaler (#1311)	2017-12-15 23:56:39 -08:00