ray/doc/source/ray-more-libs/multiprocessing.rst
Max Pumperla f9b71a8bf6
[docs] new structure (#21776)
2022-01-21 15:42:05 -08:00


.. _ray-multiprocessing:

Distributed multiprocessing.Pool
================================

.. _`issue on GitHub`: https://github.com/ray-project/ray/issues

Ray supports running distributed Python programs with the `multiprocessing.Pool API`_
using `Ray Actors <actors.html>`__ instead of local processes. This makes it easy
to scale existing applications that use ``multiprocessing.Pool`` from a single node
to a cluster.

.. _`multiprocessing.Pool API`: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

Quickstart
----------
To get started, first `install Ray <installation.html>`__, then use
``ray.util.multiprocessing.Pool`` in place of ``multiprocessing.Pool``.
This will start a local Ray cluster the first time you create a ``Pool`` and
distribute your tasks across it. See the `Run on a Cluster`_ section below for
instructions to run on a multi-node Ray cluster instead.

.. code-block:: python

    from ray.util.multiprocessing import Pool

    def f(index):
        return index

    pool = Pool()
    for result in pool.map(f, range(100)):
        print(result)

The full ``multiprocessing.Pool`` API is currently supported. Please see the
`multiprocessing documentation`_ for details.

.. warning::

    The ``context`` argument in the ``Pool`` constructor is ignored when using Ray.

.. _`multiprocessing documentation`: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

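
As a minimal sketch of the API coverage described above, the following example
exercises ``map``, ``apply_async``, ``starmap`` and ``imap`` on a single pool.
The names ``square``, ``add`` and ``run_examples`` are illustrative only, and the
``ImportError`` fallback to the standard library ``multiprocessing.Pool`` is an
assumption added here so that the script also runs where Ray is not installed.

.. code-block:: python

    # Prefer Ray's Pool. The fallback to the standard library is an
    # illustrative assumption, not something Ray requires.
    try:
        from ray.util.multiprocessing import Pool
    except ImportError:
        from multiprocessing import Pool


    def square(x):
        return x * x


    def add(a, b):
        return a + b


    def run_examples():
        """Run a few Pool methods and return their results."""
        pool = Pool()
        try:
            # Blocking map over an iterable.
            mapped = pool.map(square, range(5))
            # Submit a single call and wait for its result.
            summed = pool.apply_async(add, (2, 3)).get(timeout=60)
            # Unpack each tuple into positional arguments.
            pairs = pool.starmap(add, [(1, 2), (3, 4)])
            # Lazily iterate over results in submission order.
            lazy = list(pool.imap(square, [6, 7]))
        finally:
            pool.close()
            pool.join()
        return mapped, summed, pairs, lazy


    if __name__ == "__main__":
        print(run_examples())

Because both classes expose the same methods, switching between them should only
require changing the import.
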
Run on a Cluster
----------------
This section assumes that you have a running Ray cluster. To start a Ray cluster,
please refer to the `cluster setup <cluster/index.html>`__ instructions.

To connect a ``Pool`` to a running Ray cluster, you can specify the address of the
head node in one of two ways:

- By setting the ``RAY_ADDRESS`` environment variable.
- By passing the ``ray_address`` keyword argument to the ``Pool`` constructor.

.. code-block:: python

    from ray.util.multiprocessing import Pool

    # Starts a new local Ray cluster.
    pool = Pool()

    # Connects to a running Ray cluster, with the current node as the head node.
    # Alternatively, set the environment variable RAY_ADDRESS="auto".
    pool = Pool(ray_address="auto")

    # Connects to a running Ray cluster, with a remote node as the head node.
    # Alternatively, set the environment variable RAY_ADDRESS="<ip_address>:<port>".
    pool = Pool(ray_address="<ip_address>:<port>")

You can also start Ray manually by calling ``ray.init()`` (with any of its supported
configuration options) before creating a ``Pool``.
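
The manual start described above might look like the following sketch. The
function names ``double`` and ``run_with_manual_init`` are hypothetical, the
choice of ``num_cpus=2`` is arbitrary, and the fallback to the standard library
pool when Ray is not installed is an assumption made only to keep the example
runnable everywhere.

.. code-block:: python

    # Import Ray if available. The standard library fallback is an
    # illustrative assumption, not part of Ray's behavior.
    try:
        import ray
        from ray.util.multiprocessing import Pool
    except ImportError:
        ray = None
        from multiprocessing import Pool


    def double(x):
        return 2 * x


    def run_with_manual_init():
        """Start Ray explicitly, then create a Pool that uses it."""
        if ray is not None:
            # Start (or reuse) a local Ray instance with two logical CPUs.
            ray.init(num_cpus=2, ignore_reinit_error=True)
        pool = Pool(processes=2)
        try:
            return pool.map(double, range(4))
        finally:
            pool.close()
            pool.join()


    if __name__ == "__main__":
        print(run_with_manual_init())

Calling ``ray.init()`` first is useful when the pool should share a Ray instance
that was configured with specific options.
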