
Virtual Actors
==============

Introduction
------------

Workflows also provides a *virtual actors* abstraction, which can be thought of as syntactic sugar on top of a dynamic workflow. Virtual actors are like Ray actors, but backed by durable storage instead of a running process. You can also launch sub-workflows from the methods of each virtual actor (e.g., train models in parallel). Here is a basic example:

.. code-block:: python

    from ray import workflow
    import ray

    @workflow.virtual_actor
    class Counter:
        def __init__(self, init_val):
            self._val = init_val

        def incr(self, val=1):
            self._val += val
            print(self._val)

        @workflow.virtual_actor.readonly
        def value(self):
            return self._val

    workflow.init()

    # Initialize a Counter actor with id="my_counter".
    counter = Counter.get_or_create("my_counter", 0)

    # Similar to workflow steps, actor methods support:
    # - `run()`, which will return the value
    # - `run_async()`, which will return an ObjectRef
    counter.incr.run(10)
    assert counter.value.run() == 10

    # Non-blocking execution.
    counter.incr.run_async(10)
    counter.incr.run(10)
    assert 30 == ray.get(counter.value.run_async())

In the code above, we define a ``Counter`` virtual actor. When the ``Counter`` is created, its class definition and initial state are logged into storage as a dynamic workflow with ``workflow_id="my_counter"``. When actor methods are called, new steps are dynamically appended to the workflow and executed, returning the new actor state and result.

The ``__dict__`` of a virtual actor must be JSON serializable. Otherwise, ``__getstate__`` and ``__setstate__`` must be defined; they are called on each step to save and restore the actor.
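
As a rough illustration of this requirement, the sketch below uses a plain Python class whose state holds a ``set``, which JSON cannot encode, so it defines ``__getstate__`` and ``__setstate__`` to convert to and from a JSON-friendly dict. The class ``TagStore`` and the helpers ``save`` and ``restore`` are hypothetical. They only mimic the save/restore round trip described above and are not the workflow library's internals. The ``@workflow.virtual_actor`` decorator is left off so that the snippet runs without Ray.

.. code-block:: python

    import json


    class TagStore:
        """Hypothetical actor-like class with a non-JSON-serializable attribute."""

        def __init__(self):
            self._tags = set()  # json.dumps cannot encode a set

        def add(self, tag):
            self._tags.add(tag)

        def tags(self):
            return sorted(self._tags)

        def __getstate__(self):
            # Convert the state into JSON-friendly types.
            return {"tags": sorted(self._tags)}

        def __setstate__(self, state):
            # Rebuild the original in-memory representation.
            self._tags = set(state["tags"])


    def save(actor):
        # Stand-in for persisting actor state after a step (assumption).
        return json.dumps(actor.__getstate__())


    def restore(cls, payload):
        # Stand-in for restoring actor state before a step (assumption).
        actor = cls.__new__(cls)
        actor.__setstate__(json.loads(payload))
        return actor


    store = TagStore()
    store.add("b")
    store.add("a")
    restored = restore(TagStore, save(store))
    assert restored.tags() == ["a", "b"]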

We can retrieve the actor via its ``workflow_id`` in another process, to get the value:

.. code-block:: python

    counter = workflow.get_actor(workflow_id="my_counter")
    assert 30 == counter.value.run()

Read-only methods have lower overhead because they skip action logging, and they can execute concurrently with mutating methods on the actor.

Launching sub-workflows from actor methods
------------------------------------------

Inside virtual actor methods, you can launch sub-workflows that involve other methods of the virtual actor. These sub-workflows can also include workflow steps defined outside the actor class, for example:

.. code-block:: python

    @workflow.step
    def double(s):
        return 2 * s

    @workflow.virtual_actor
    class Actor:
        def __init__(self):
            self.val = 1

        def double(self, update):
            step = double.step(self.val)
            if not update:
                # inside the method, a workflow can be launched
                return step
            else:
                # a workflow can also be passed to another method
                return self.update.step(step)

        def update(self, v):
            self.val = v
            return self.val


    handler = Actor.get_or_create("actor")
    assert handler.double.run(False) == 2
    assert handler.double.run(False) == 2
    assert handler.double.run(True) == 2
    assert handler.double.run(True) == 4

Actor method ordering
---------------------

Workflow virtual actors provide ordering guarantees similar to those of Ray actors: methods are executed in the order they are submitted, provided they are submitted from the same thread. This applies to both ``.run()`` (trivially true) and ``.run_async()``, and is guaranteed to hold even under cluster failures. Hence, you can use actor methods as a short-lived queue of work for the actor to process.

When an actor method launches a sub-workflow, that entire sub-workflow will be run as part of the actor method step. This means all steps of the sub-workflow will be guaranteed to complete before any other queued actor method calls are run. However, note that the sub-workflow is not transactional, that is, read-only methods can read intermediate actor state written by steps of the sub-workflow.
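
The ordering described above can be pictured with a toy model. The sketch below is not Ray's scheduler and does not model durable storage or failure recovery. It uses a single-worker ``ThreadPoolExecutor`` from the standard library as a stand-in for the actor's queue, and the functions ``incr`` and ``checkout`` are hypothetical. It only shows that calls submitted from one thread run in submission order, and that the steps inside one call finish before the next queued call starts.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor

    log = []


    def incr(amount):
        log.append(("incr", amount))


    def checkout():
        # The "sub-workflow": its steps run inline as part of this one call.
        for step in ("charge", "pack", "ship"):
            log.append(("checkout", step))


    # A single worker models an actor that handles one queued call at a time.
    with ThreadPoolExecutor(max_workers=1) as executor:
        futures = [
            executor.submit(incr, 1),
            executor.submit(checkout),
            executor.submit(incr, 2),
        ]
        for future in futures:
            future.result()

    # All checkout steps appear before the later incr call.
    assert log[0] == ("incr", 1)
    assert log[-1] == ("incr", 2)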

Long-lived sub-workflows
------------------------

We do not recommend running long-lived workflows as sub-workflows of a virtual actor, because a sub-workflow blocks subsequent actor method calls from executing while it is running. Instead, you can launch a *separate* workflow and track its execution using workflow API methods. If the workflow id is generated deterministically (ensuring idempotency), no duplicate workflows will be launched even if there is a failure.

.. code-block:: python
    :caption: Long-lived sub-workflow (bad).

    @workflow.virtual_actor
    class ShoppingCart:
        ...
        # BAD: blocks until shipping completes, which could be
        # slow. Until that workflow finishes, no mutating methods
        # can be called on this actor.
        def do_checkout(self):
            # Run shipping workflow as sub-workflow of this method.
            return ship_items.step(self.items)

.. code-block:: python
    :caption: Launching separate workflows (good).

    @workflow.virtual_actor
    class ShoppingCart:
        ...
        # GOOD: the checkout method is non-blocking, and the shipment
        # status can be monitored via ``self.shipment_workflow_id``.
        def do_checkout(self):
            # Deterministically generate a workflow id for idempotency.
            self.shipment_workflow_id = "ship_{}".format(self.order_id)
            # Run shipping workflow as a separate async workflow.
            ship_items.step(self.items).run_async(
                workflow_id=self.shipment_workflow_id)
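
To see why a deterministic workflow id gives idempotency, consider the following toy model. It does not use the workflow API: the ``launched`` dict is a stand-in for durable workflow storage, and ``workflow_id_for`` and ``launch_once`` are hypothetical helpers. How the library itself detects an existing workflow id is not shown here. The sketch only demonstrates that a retried launch with the same id becomes a no-op.

.. code-block:: python

    launched = {}  # workflow_id -> items; stand-in for durable storage


    def workflow_id_for(order_id):
        # The same order always maps to the same workflow id.
        return "ship_{}".format(order_id)


    def launch_once(workflow_id, items):
        """Start a workflow only if this id has not been seen before."""
        if workflow_id in launched:
            return False  # already launched, so the retry does nothing
        launched[workflow_id] = list(items)
        return True


    first = launch_once(workflow_id_for(42), ["book"])
    retry = launch_once(workflow_id_for(42), ["book"])  # e.g. after a failure
    assert first is True and retry is False
    assert list(launched) == ["ship_42"]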

Receiving external events
-------------------------

**Note: This feature is not yet implemented.**