This adds (experimental) auto-scaling support for Ray clusters based on GCS load metrics. The auto-scaling algorithm is as follows:

- Based on the current (instantaneous) load information, we compute the approximate number of "used workers". This is driven by the bottleneck resource: e.g., if 8/8 GPUs are in use in an 8-node cluster but all the CPUs are idle, the number of used nodes is still counted as 8. This number can be fractional.
- We scale that number by 1 / target_utilization_fraction and round up to determine the target cluster size (subject to the max_workers constraint). The autoscaler control loop then launches new nodes until the target cluster size is met (see the sketch below).
- When a node has been idle for more than idle_timeout_minutes, we remove it from the cluster, provided that would not drop the cluster size below min_workers.

Note that we'll need to update the wheel in the example yaml file after this PR is merged.
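The following is a minimal sketch of the sizing logic described above, not the implementation in this PR; the function and parameter names (`target_num_workers`, `resource_used`, `last_used_time_by_node`, etc.) are illustrative assumptions.

```python
import math
import time


def target_num_workers(resource_used, resource_total, num_workers,
                       target_utilization_fraction, max_workers):
    """Compute the desired cluster size from instantaneous load.

    `resource_used` / `resource_total` map resource names (e.g. "CPU",
    "GPU") to the amounts currently in use / available cluster-wide.
    """
    # "Used workers" is driven by the bottleneck resource: the fraction of
    # each resource in use, times the current number of workers. The result
    # can be fractional.
    used_workers = 0.0
    for name, total in resource_total.items():
        if total > 0:
            used_workers = max(
                used_workers,
                num_workers * resource_used.get(name, 0.0) / total)

    # Scale by 1 / target_utilization_fraction, round up, and cap at
    # max_workers.
    target = int(math.ceil(used_workers / target_utilization_fraction))
    return min(target, max_workers)


def workers_to_remove(last_used_time_by_node, num_workers,
                      idle_timeout_minutes, min_workers, now=None):
    """Pick idle workers to terminate without dropping below min_workers."""
    now = time.time() if now is None else now
    horizon = now - 60 * idle_timeout_minutes
    removable = [node for node, last_used in last_used_time_by_node.items()
                 if last_used < horizon]
    # Never shrink the cluster below min_workers.
    max_removals = max(0, num_workers - min_workers)
    return removable[:max_removals]
```

For example, with 8/8 GPUs busy across 8 workers and a target_utilization_fraction of 0.8, this yields ceil(8 / 0.8) = 10 target workers, subject to max_workers.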