mirror of
https://github.com/vale981/ray
synced 2025-03-12 14:16:39 -04:00

Update autoscaler configuration docs for VM stack. Removed the video, after looking at it it fits better in overview / and is possibly outdated Co-authored-by: Eric Liang <ekhliang@gmail.com>
51 lines
2.9 KiB
ReStructuredText
51 lines
2.9 KiB
ReStructuredText
.. include:: /_includes/clusters/we_are_hiring.rst
|
|
|
|
.. _deployment-guide-autoscaler-under-construction:
|
|
|
|
Configuring Autoscaling
|
|
=======================
|
|
|
|
This guide explains how to configure the Ray autoscaler. The Ray autoscaler adjusts
|
|
the number of nodes in the cluster based on the resources required by tasks, actors or
|
|
placement groups.
|
|
|
|
Note that the autoscaler only considers logical resource requests for scaling (i.e., those specified
|
|
in ``@ray.remote`` and displayed in ``ray status``), not machine utilization.
|
|
If a user tries to launch an actor, task, or placement group but there are insufficient resources, the request will be queued.
|
|
The autoscaler adds nodes to satisfy resource demands in the queue, and removes nodes when they become idle.
|
|
|
|
Parameters
|
|
==========
|
|
|
|
The following are the autoscaling parameters that are specified with cluster launch. They can also be modified at runtime by
|
|
updating the cluster config.
|
|
|
|
**max_workers[default_value=2, min_value=0]**: Specify the max number of cluster worker nodes. Note that this excludes the head node.
|
|
|
|
.. note::
|
|
|
|
If this value is modified at runtime, the autoscaler will immediately remove nodes until this constraint
|
|
is satisfied. This may disrupt running workloads.
|
|
|
|
**upscaling_speed[default_value=1.0, min_value=1.0]**: The number of nodes allowed to be pending as a multiple of the current number of nodes.
|
|
For example, if this is set to 1.0, the cluster can grow in size by at most 100% at any time, so if the cluster currently has 20 nodes, at most 20 pending
|
|
launches are allowed. The minimum number of pending launches is 5 regardless of this setting.
|
|
|
|
**idle_timeout_minutes[default_value=5, min_value=0]**: The number of minutes that need to pass before an idle worker node is removed by the
|
|
autoscaler. Worker nodes are idle when they hold no active tasks, actors, or referenced objects (either in-memory or spilled to disk). This parameter does not affect the head node.
|
|
|
|
**available_node_types.<node_type_name>.min_workers[default_value=0, min_value=0]**: The minimum number of worker nodes of a given type to launch. If this number
|
|
is set to greater than zero, the autoscaler will maintain the number of nodes regardless of utilization. The sum of the min worker of all the node types
|
|
must be less than or equal to the max_workers for the cluster.
|
|
|
|
**available_node_types.<node_type_name>.max_workers[default_value=cluster max_workers, min_value=0]**: The maximum number of worker nodes of a given type to launch. This must be
|
|
greater than or equal to available_node_types.<node_type_name>.min_workers. It must be less than or equal to the max_workers for the cluster.
|
|
|
|
.. note::
|
|
If this value is modified at runtime, the autoscaler will immediately remove nodes until this constraint
|
|
is satisfied. This may disrupt running workloads.
|
|
|
|
Autoscaler SDK
|
|
==============
|
|
|
|
For more information on programmatic access to the autoscaler, see :ref:`Autoscaler SDK<ref-autoscaler-sdk-under-construction>`.
|