.. _ray-multiprocessing:

Distributed multiprocessing.Pool
================================

.. _`issue on GitHub`: https://github.com/ray-project/ray/issues

Ray supports running distributed Python programs with the `multiprocessing.Pool API`_
using `Ray Actors <actors.html>`__ instead of local processes. This makes it easy
to scale existing applications that use ``multiprocessing.Pool`` from a single node
to a cluster.

.. _`multiprocessing.Pool API`: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

Quickstart
----------

To get started, first `install Ray <installation.html>`__, then use
``ray.util.multiprocessing.Pool`` in place of ``multiprocessing.Pool``.
This starts a local Ray cluster the first time you create a ``Pool`` and
distributes your tasks across it. See the `Run on a Cluster`_ section below for
instructions on running on a multi-node Ray cluster instead.

.. code-block:: python

  from ray.util.multiprocessing import Pool

  def f(index):
      return index

  pool = Pool()
  for result in pool.map(f, range(100)):
      print(result)

The full ``multiprocessing.Pool`` API is currently supported. Please see the
`multiprocessing documentation`_ for details.

.. warning::

  The ``context`` argument in the ``Pool`` constructor is ignored when using Ray.

.. _`multiprocessing documentation`: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

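
As a rough illustration of the API coverage described in this section, the
sketch below exercises a few other standard ``Pool`` methods (``apply_async``,
``starmap``, and ``imap_unordered``). The helper functions ``square`` and ``add``
are made up for this example, and the ``ImportError`` fallback to the standard
library's ``ThreadPool`` is only an assumption added so the snippet also runs
where Ray is not installed:

.. code-block:: python

  try:
      # Ray-backed pool (starts a local Ray cluster on first use).
      from ray.util.multiprocessing import Pool
  except ImportError:
      # Assumption for illustration only: fall back to the standard library.
      from multiprocessing.pool import ThreadPool as Pool


  def square(x):
      return x * x


  def add(x, y):
      return x + y


  pool = Pool(2)

  # apply_async returns a handle; get() blocks until the result is ready.
  async_result = pool.apply_async(square, (7,))
  single = async_result.get()

  # starmap unpacks each tuple into positional arguments.
  sums = pool.starmap(add, [(1, 2), (3, 4), (5, 6)])

  # imap_unordered yields results as they complete, so sort before comparing.
  squares = sorted(pool.imap_unordered(square, range(5)))

  pool.close()
  pool.join()

  print(single, sums, squares)

Because the calls mirror the standard library, the same code should behave the
same way whichever of the two imports succeeds.
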
Run on a Cluster
----------------

This section assumes that you have a running Ray cluster. To start a Ray cluster,
please refer to the `cluster setup <cluster/index.html>`__ instructions.

To connect a ``Pool`` to a running Ray cluster, you can specify the address of the
head node in one of two ways:

- By setting the ``RAY_ADDRESS`` environment variable.
- By passing the ``ray_address`` keyword argument to the ``Pool`` constructor.

.. code-block:: python

  from ray.util.multiprocessing import Pool

  # Starts a new local Ray cluster.
  pool = Pool()

  # Connects to a running Ray cluster, with the current node as the head node.
  # Alternatively, set the environment variable RAY_ADDRESS="auto".
  pool = Pool(ray_address="auto")

  # Connects to a running Ray cluster, with a remote node as the head node.
  # Alternatively, set the environment variable RAY_ADDRESS="<ip_address>:<port>".
  pool = Pool(ray_address="<ip_address>:<port>")

You can also start Ray manually by calling ``ray.init()`` (with any of its supported
configuration options) before creating a ``Pool``.
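
A minimal sketch of starting Ray manually before creating a ``Pool``, assuming
Ray is installed. The ``num_cpus=2`` setting is an arbitrary value chosen for
illustration, and ``double`` is a hypothetical helper:

.. code-block:: python

  import ray
  from ray.util.multiprocessing import Pool


  def double(x):
      return 2 * x


  # Start Ray explicitly. num_cpus=2 is an illustrative value only.
  ray.init(num_cpus=2)

  # Created after ray.init(), so the Pool uses the Ray instance started above.
  pool = Pool()
  results = pool.map(double, [1, 2, 3])
  print(results)

  pool.close()
  pool.join()
  ray.shutdown()
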