Tasks or actors can often contend over the same resource or need to communicate with each other. Here are some standard ways to perform synchronization across Ray processes.

Inter-process synchronization using FileLock
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you have several tasks or actors writing to the same file or downloading a file on a single node, you can use `FileLock <https://pypi.org/project/filelock/>`_ to synchronize.
This often occurs for data loading and preprocessing.

.. code-block:: python

  import ray
  from filelock import FileLock

  @ray.remote
  def write_to_file(text):
      # Create a filelock object. Consider using an absolute path for the lock.
      with FileLock("my_data.txt.lock"):
          with open("my_data.txt", "a") as f:
              f.write(text)

  ray.init()
  ray.get([write_to_file.remote("hi there!\n") for i in range(3)])

  with open("my_data.txt") as f:
      print(f.read())

  ## Output is:

  # hi there!
  # hi there!
  # hi there!

Multi-node synchronization using ``SignalActor``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When you have multiple tasks that need to wait on some condition, you can use a ``SignalActor`` to coordinate.

.. code-block:: python

  # Also available via `from ray.test_utils import SignalActor`
  import ray
  import asyncio

  @ray.remote(num_cpus=0)
  class SignalActor:
      def __init__(self):
          self.ready_event = asyncio.Event()

      def send(self, clear=False):
          self.ready_event.set()
          if clear:
              self.ready_event.clear()

      async def wait(self, should_wait=True):
          if should_wait:
              await self.ready_event.wait()

  @ray.remote
  def wait_and_go(signal):
      # Block until the actor's event has been set.
      ray.get(signal.wait.remote())
      print("go!")

  ray.init()
  signal = SignalActor.remote()
  tasks = [wait_and_go.remote(signal) for _ in range(4)]

  # The tasks stay blocked in wait() until the signal is sent.
  ray.get(signal.send.remote())

  # The tasks are now unblocked and can finish.
  ray.get(tasks)

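The gating behaviour of ``SignalActor`` comes from ``asyncio.Event``: every waiter suspends on the event until ``send`` sets it. The following is a minimal sketch of that mechanism in a single process, without Ray. ``LocalSignal`` and the ``log`` list are hypothetical names introduced only for this illustration, and the sketch assumes nothing about how Ray schedules the actor's coroutines.

```python
import asyncio


class LocalSignal:
    """Hypothetical in-process stand-in for SignalActor (no Ray involved)."""

    def __init__(self):
        self.ready_event = asyncio.Event()

    def send(self, clear=False):
        # Wake every coroutine currently waiting on the event.
        self.ready_event.set()
        if clear:
            # Re-arm the event so that later waiters block again.
            self.ready_event.clear()

    async def wait(self, should_wait=True):
        if should_wait:
            await self.ready_event.wait()


async def wait_and_go(signal, name, log):
    await signal.wait()
    log.append(f"{name}: go!")


async def main():
    log = []
    signal = LocalSignal()
    waiters = [
        asyncio.create_task(wait_and_go(signal, f"task-{i}", log))
        for i in range(4)
    ]

    # Yield once so every waiter starts and suspends on the event.
    await asyncio.sleep(0)

    log.append("set..")
    signal.send()
    await asyncio.gather(*waiters)
    return log


result = asyncio.run(main())
print(result)
```

The ``"set.."`` entry is logged before any ``go!`` entry, which shows that no waiter proceeds until ``send`` is called.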
For example, here we instantiate many copies of the same actor with varying resource requirements. Note that to create these actors successfully, Ray will need to be started with sufficient CPU resources and the relevant custom resources:
Ray supports resource-specific accelerator types. Use the ``accelerator_type`` option to force a task to run on a node with a specific type of accelerator. Under the hood, the option is implemented as a custom resource demand of ``"accelerator_type:<type>": 0.001``. This forces the task to be placed on a node that has that accelerator type available. It also lets the multi-node-type autoscaler know that there is demand for that type of resource, potentially triggering the launch of new nodes that provide that accelerator.
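To make the custom resource demand concrete, here is a toy model of the placement check, written in plain Python. It is not Ray's scheduler: ``accelerator_demand``, ``feasible_nodes``, the node names, and the resource quantities are all hypothetical, and the sketch assumes each node advertises a resource named ``accelerator_type:<type>`` for the accelerator it carries.

```python
def accelerator_demand(accelerator_type):
    """Build the custom resource demand described above for one accelerator type."""
    return {f"accelerator_type:{accelerator_type}": 0.001}


def feasible_nodes(demand, nodes):
    """Return the names of nodes whose resources cover every demanded amount."""
    return [
        name
        for name, available in nodes.items()
        if all(
            available.get(resource, 0) >= amount
            for resource, amount in demand.items()
        )
    ]


# Hypothetical cluster: names and quantities are made up for illustration.
nodes = {
    "node-a": {"CPU": 8, "GPU": 1, "accelerator_type:V100": 1},
    "node-b": {"CPU": 8, "GPU": 1, "accelerator_type:T4": 1},
    "node-c": {"CPU": 16},
}

# A task asking for one GPU of a specific accelerator type.
demand = {"GPU": 1, **accelerator_demand("V100")}
print(feasible_nodes(demand, nodes))  # ['node-a']

# No node offers this type, so the demand stays unmet.
print(feasible_nodes(accelerator_demand("A100"), nodes))  # []
```

In this model, an unmet demand like the second one is the kind of signal an autoscaler could use to decide to launch a node that provides the missing accelerator.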
**One limitation** is that the definition of ``f`` must come before the
definitions of ``g`` and ``h`` because as soon as ``g`` is defined, it
will be pickled and shipped to the workers, and so if ``f`` hasn't been
defined yet, the definition will be incomplete.
Circular Dependencies
---------------------
Consider the following remote function.

.. code-block:: python

  @ray.remote(num_cpus=1, num_gpus=1)
  def g():
      return ray.get(f.remote())

When a ``g`` task is executing, it will release its CPU resources when it gets
blocked in the call to ``ray.get``. It will reacquire the CPU resources when
``ray.get`` returns. It will retain its GPU resources throughout the lifetime of
the task because the task will most likely continue to use GPU memory.
Cython Code in Ray
------------------
To use Cython code in Ray, run the following from the directory ``$RAY_HOME/examples/cython``:

.. code-block:: bash

  pip install scipy # For BLAS example
  pip install -e .
  python cython_main.py --help

You can import the ``cython_examples`` module from a Python script or interpreter.
Notes
~~~~~
* You **must** include the following two lines at the top of any ``*.pyx`` file:

  .. code-block:: python

    #!python
    # cython: embedsignature=True, binding=True

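* You cannot decorate Cython functions within a ``*.pyx`` file (there are ways around this, but they create a leaky abstraction between Cython and Python that would be very challenging to support generally). Instead, wrap the function in your Python code by calling ``ray.remote`` on it, for example ``some_cython_func = ray.remote(some_cython_module.some_cython_func)`` (the module and function names here are placeholders).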
* You cannot transfer memory buffers to a remote function (see ``example8``, which currently fails); your remote function must return a value.
* Have a look at ``cython_main.py``, ``cython_simple.pyx``, and ``setup.py`` for examples of how to call, define, and build Cython code, respectively. The Cython `documentation <http://cython.readthedocs.io/>`_ is also very helpful.
* Several limitations come from Cython's own `unsupported <https://github.com/cython/cython/wiki/Unsupported>`_ Python features.
* We currently do not support compiling and distributing Cython code to ``ray`` clusters. In other words, Cython developers are responsible for compiling and distributing any Cython code to their cluster, much as users who need Python packages like ``scipy`` must install them on the cluster themselves.
* For most simple use cases, developers need not worry about Python 2 or 3, but users who do need to care can have a look at the ``language_level`` Cython compiler directive (see `here <http://cython.readthedocs.io/en/latest/src/reference/compilation.html>`_).