hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-11 21:56:39 -04:00

shrekris-anyscale d809d748cf

[Serve] [Docs] Add consolidated Model Composition user guide (#26860)

This change adds introductory deployment graph documentation.

Links to updated documentation:
* [Model Composition](https://ray--26860.org.readthedocs.build/en/26860/serve/model_composition.html)
* [Examples Overview](https://ray--26860.org.readthedocs.build/en/26860/serve/tutorials/index.html)
* [Deployment Graph Pattern Overview](https://ray--26860.org.readthedocs.build/en/26860/serve/tutorials/deployment-graph-patterns.html)
  * [Pattern: Linear Pipeline](https://ray--26860.org.readthedocs.build/en/26860/serve/tutorials/deployment-graph-patterns/linear_pipeline.html)
  * [Pattern: Branching Input](https://ray--26860.org.readthedocs.build/en/26860/serve/tutorials/deployment-graph-patterns/branching_input.html)
  * [Pattern: Conditional](https://ray--26860.org.readthedocs.build/en/26860/serve/tutorials/deployment-graph-patterns/conditional.html)

Co-authored-by: Archit Kulkarni <architkulkarni@users.noreply.github.com>

2022-08-09 17:06:23 -05:00


Pattern: Conditional

This deployment graph pattern allows you to control your graph's flow using conditionals. You can use this pattern to introduce a dynamic path for your requests to flow through.

Code

:language: python
:start-after: __graph_start__
:end-before: __graph_end__

:::{note} combine takes the intermediate values from the call graph as individual arguments, value1 and value2. You can also aggregate these intermediate values and pass them as a single list argument. However, this list contains references to the values rather than the values themselves, so you must explicitly use await to get the actual values before using them. Use await instead of ray.get to avoid blocking the deployment.

For example:

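import asyncio

dag = combine.bind([output1, output2], user_input[1])
...
@serve.deployment
async def combine(value_refs, combine_type):
    # value_refs is a list of references, so await its elements
    # (a plain list cannot itself be awaited).
    values = await asyncio.gather(*value_refs)
    value1, value2 = values
...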

:::
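The difference between awaiting and blocking can be sketched with plain asyncio, without Ray. This is only an illustration under stated assumptions: `fake_model_output` is a hypothetical stand-in for a deployment call that completes later, and the futures it produces stand in for the references in the list. How Ray Serve actually delivers those references may differ.

```python
import asyncio


async def fake_model_output(value: int) -> int:
    # Hypothetical stand-in for a model call that finishes asynchronously.
    await asyncio.sleep(0)
    return value


async def combine(value_refs, combine_type: str) -> int:
    # Await all references together; the event loop stays free meanwhile.
    value1, value2 = await asyncio.gather(*value_refs)
    if combine_type == "sum":
        return value1 + value2
    return max(value1, value2)


async def main() -> list:
    results = []
    for combine_type in ("max", "sum"):
        # Schedule both stand-in calls, then hand the pending futures over.
        refs = [
            asyncio.ensure_future(fake_model_output(1)),
            asyncio.ensure_future(fake_model_output(2)),
        ]
        results.append(await combine(refs, combine_type))
    return results


resolved = asyncio.run(main())
print(resolved)  # [2, 3]
```

Because `combine` awaits the list elements instead of blocking on them, other coroutines can run while the values are still pending.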

Execution

The graph creates two Model nodes, with weights of 0 and 1. It then takes the user_input and unpacks it into two parts: a number and an operation.

:::{note} dag.execute() can take an arbitrary number of arguments. These arguments can be unpacked by indexing into the InputNode. For example,

with InputNode() as user_input:
    input_number, input_operation = user_input[0], user_input[1]

:::
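The indexing shown in the note happens while the graph is being built, before any request values exist. One way to picture this is a placeholder whose indexing returns a deferred lookup that is resolved once the arguments arrive. The sketch below is plain Python and does not use Ray; `InputPlaceholder` and `execute` are hypothetical names, and Ray Serve's actual InputNode may be implemented differently.

```python
class InputPlaceholder:
    """Hypothetical stand-in for an input node (not Ray's implementation)."""

    def __getitem__(self, index: int):
        # No request values exist yet, so return a deferred lookup
        # that picks the positional argument once it is available.
        return lambda args: args[index]


def execute(*args):
    # Build time: indexing only records which positions are wanted.
    user_input = InputPlaceholder()
    input_number, input_operation = user_input[0], user_input[1]
    # Execution time: resolve each deferred lookup against the real arguments.
    return input_number(args), input_operation(args)


unpacked = execute(1, "max")
print(unpacked)  # (1, 'max')
```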

It passes the number into the two Model nodes, similar to the branching input pattern. Then it passes the requested operation, as well as the intermediate outputs, to the combine deployment to get a final result.

The example script makes two requests to the graph, both with a number input of 1. The resulting calculations are:

max:

input = 1
output1 = input + weight_1 = 1 + 0 = 1
output2 = input + weight_2 = 1 + 1 = 2
combine_output = max(output1, output2) = max(1, 2) = 2

sum:

input = 1
output1 = input + weight_1 = 1 + 0 = 1
output2 = input + weight_2 = 1 + 1 = 2
combine_output = output1 + output2 = 1 + 2 = 3

The final outputs are 2 and 3:

$ python conditional.py

2
3
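The arithmetic above can be checked with a plain-Python equivalent of the graph. This is a sketch, not the contents of conditional.py: it drops Ray entirely, and the names `forward` and `run_graph`, as well as the error raised for an unknown operation, are assumptions made for illustration.

```python
class Model:
    def __init__(self, weight: int):
        self.weight = weight

    def forward(self, number: int) -> int:
        # Each model adds its weight to the input number.
        return number + self.weight


def combine(value1: int, value2: int, operation: str) -> int:
    # The conditional step: the requested operation selects the path.
    if operation == "sum":
        return value1 + value2
    if operation == "max":
        return max(value1, value2)
    raise ValueError(f"unsupported operation: {operation}")


def run_graph(number: int, operation: str) -> int:
    # Two models with weights 0 and 1 both receive the same number.
    model1, model2 = Model(0), Model(1)
    return combine(model1.forward(number), model2.forward(number), operation)


results = [run_graph(1, "max"), run_graph(1, "sum")]
print(results)  # [2, 3]
```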
