* SLURM requires that multiple copies of the same program be submitted to the same cluster to do cluster programming. This is particularly well-suited for MPI-based workloads.
* Ray, on the other hand, expects a head-worker architecture with a single point of entry. That is, you'll need to start a Ray head node, multiple Ray worker nodes, and run your Ray script on the head node.
Many SLURM deployments require you to interact with SLURM via ``sbatch``, which executes a batch script on SLURM.
To run a Ray job with ``sbatch``, you will want to start a Ray cluster in the sbatch job with multiple ``srun`` commands (tasks), and then execute your Python script that uses Ray (a sketch of these ``srun`` commands appears after the directives below). Each task will run on a separate node and start or connect to a Ray runtime.
Since we've set ``--tasks-per-node=1``, this guarantees that each Ray worker runtime will obtain the
proper resources. In this example, we ask for at least 5 CPUs and 5 GB of memory per node.
.. code-block:: bash

    ### Modify this according to your Ray workload.
    #SBATCH --cpus-per-task=5
    #SBATCH --mem-per-cpu=1GB
    ### Similarly, you can also specify the number of GPUs per node.
    ### Modify this according to your Ray workload. Sometimes this
    ### should be 'gres' instead.
    #SBATCH --gpus-per-task=1
You can also add other optional flags to your sbatch directives.
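As a reference, here is a minimal sketch of the ``srun`` commands such an sbatch script would run to bring up the Ray cluster and then execute your driver script. The port ``6379`` (Ray's default), the sleep durations, and the script name ``my_ray_script.py`` are placeholder assumptions; ``$head_node``, ``$head_node_ip``, and ``$nodes_array`` are obtained as shown in the "Obtain the head IP address" section below.

.. code-block:: bash

    # Start the Ray head runtime on the first allocated node and keep it alive.
    srun --nodes=1 --ntasks=1 -w "$head_node" \
        ray start --head --node-ip-address="$head_node_ip" --port=6379 --block &
    sleep 10

    # Start one Ray worker runtime on each of the remaining nodes.
    worker_num=$((SLURM_JOB_NUM_NODES - 1))
    for ((i = 1; i <= worker_num; i++)); do
        node_i=${nodes_array[$i]}
        srun --nodes=1 --ntasks=1 -w "$node_i" \
            ray start --address="$head_node_ip:6379" --block &
        sleep 5
    done

    # Finally, run the driver script (placeholder name) that uses Ray.
    python my_ray_script.py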
Loading your environment
~~~~~~~~~~~~~~~~~~~~~~~~
First, you'll often want to load modules or activate your own conda environment at the beginning of the script.
Note that this is an optional step, but it is often required for enabling the right set of dependencies.
.. code-block:: bash

    # Example: module load pytorch/v1.4.0-gpu
    # Example: conda activate my-env
    conda activate my-env
Obtain the head IP address
~~~~~~~~~~~~~~~~~~~~~~~~~~
Next, we'll want to obtain a hostname and a node IP address for the head node. This way, when we start worker nodes, we'll be able to properly connect to the right head node.
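A minimal sketch of how this can be done inside the sbatch script, assuming the ``hostname`` command on the nodes supports ``--ip-address``:

.. code-block:: bash

    # List the node names allocated to this job and use the first one as the head node.
    nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
    nodes_array=($nodes)
    head_node=${nodes_array[0]}

    # Resolve the head node's IP address by running `hostname --ip-address` on it.
    # If the node reports several addresses, pick the one reachable from the workers.
    head_node_ip=$(srun --nodes=1 --ntasks=1 -w "$head_node" hostname --ip-address)
    echo "Head node: $head_node ($head_node_ip)"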
There are other options you can use when calling ``python slurm-launch.py`` (a usage sketch follows the list):

* ``--exp-name``: The experiment name. Will generate ``{exp-name}_{date}-{time}.sh`` and ``{exp-name}_{date}-{time}.log``.
* ``--command``: The command you wish to run. For example: ``rllib train XXX`` or ``python XXX.py``.
* ``--num-gpus``: The number of GPUs you wish to use in each computing node. Default: 0.
* ``--node`` (``-w``): The specific nodes you wish to use, in the same form as the output of ``sinfo``. Nodes are automatically assigned if not specified.
* ``--num-nodes`` (``-n``): The number of nodes you wish to use. Default: 1.
* ``--partition`` (``-p``): The partition you wish to use. Default: "", which uses the user's default partition.
* ``--load-env``: The command to set up your environment. For example: ``module load cuda/10.1``. Default: "".
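For example, a hypothetical invocation for a two-node job (the experiment name, driver script, and environment command below are placeholders):

.. code-block:: bash

    python slurm-launch.py \
        --exp-name test-job \
        --command "python my_ray_script.py" \
        --num-nodes 2 \
        --num-gpus 1 \
        --load-env "conda activate my-env"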
Note that the :ref:`slurm-template.sh <slurm-template>` is compatible with both IPv4 and IPv6 addresses of the compute nodes.
Implementation
~~~~~~~~~~~~~~
Concretely, :ref:`slurm-launch.py <slurm-launch>` does the following things (a rough manual equivalent is sketched after the list):

1. It automatically writes your requirements (e.g., the number of CPUs and GPUs per node, the number of nodes, and so on) to an sbatch script named ``{exp-name}_{date}-{time}.sh``. Your command (``--command``) to launch your own job is also written into the sbatch script.
2. Then it submits the sbatch script to the SLURM manager via a new process.
3. Finally, the Python process terminates itself and leaves a log file named ``{exp-name}_{date}-{time}.log`` that records the progress of your submitted command. In the meantime, the Ray cluster and your job are running on the SLURM cluster.
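Roughly speaking, these steps amount to the following manual commands, shown here with a hypothetical generated file name:

.. code-block:: bash

    # Step 2: submit the generated sbatch script to SLURM.
    sbatch test-job_0113-1540.sh

    # Step 3: follow the progress of the submitted command through the generated log file.
    tail -f test-job_0113-1540.log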
Examples and templates
----------------------
Here are some community-contributed templates for using SLURM with Ray:

- `Ray sbatch submission scripts`_ used at `NERSC <https://www.nersc.gov/>`_, a US national lab.
- `YASPI`_ (yet another slurm python interface) by @albanie. The goal of YASPI is to provide an interface for submitting SLURM jobs, thereby obviating the joys of sbatch files. It does so through recipes - collections of templates and rules for generating sbatch scripts. Supports job submission for Ray.
- `Convenient python interface`_ to launch a Ray cluster and submit tasks, by @pengzhenghao