mirror of https://github.com/vale981/ray
synced 2025-03-05 18:11:42 -05:00

[Serve] [Doc] Create top-level page for Calling Endpoints from HTTP and from Python (#14904)

parent 2e9b065260
commit 03afaed6e1
9 changed files with 273 additions and 241 deletions
@@ -260,10 +260,10 @@ Papers

serve/index.rst
serve/tutorial.rst
serve/core-apis.rst
serve/http-servehandle.rst
serve/deployment.rst
serve/ml-models.rst
serve/advanced-traffic.rst
serve/advanced.rst
serve/performance.rst
serve/architecture.rst
serve/tutorials/index.rst
@@ -92,7 +92,7 @@ The shard key can either be specified via the X-SERVE-SHARD-KEY HTTP header or :

# Specifying the shard key via an HTTP header.
requests.get("127.0.0.1:8000/api", headers={"X-SERVE-SHARD-KEY": session_id})

# Specifying the shard key in a call made via serve handle.
# Specifying the shard key in a call made via ServeHandle.
handle = serve.get_handle("api_endpoint")
handle.options(shard_key=session_id).remote(args)
@@ -1,81 +0,0 @@
======================================
Advanced Topics and Configurations
======================================

Ray Serve has a number of knobs and tools for you to tune for your particular workload.
All Ray Serve advanced options and topics are covered on this page aside from the
fundamentals of :doc:`deployment`. For a more hands-on take, please check out the :ref:`serve-tutorials`.

There are a number of things you'll likely want to do with your serving application, including
scaling out, splitting traffic, or batching input for better performance. To do all of this,
you will create a ``BackendConfig``, a configuration object that you'll use to set
the properties of a particular backend.

.. _serve-sync-async-handles:

Sync and Async Handles
======================

Ray Serve offers two types of ``ServeHandle``. You can use the ``serve.get_handle(..., sync=True|False)``
flag to toggle between them.

- When you set ``sync=True`` (the default), a synchronous handle is returned.
  Calling ``handle.remote()`` returns a Ray ObjectRef.
- When you set ``sync=False``, an asyncio-based handle is returned. You need to
  call it with ``await handle.remote()`` to get a Ray ObjectRef. To use ``await``,
  you have to run ``serve.get_handle`` and ``handle.remote`` in a Python asyncio event loop.

The async handle has a performance advantage because it uses asyncio directly, as compared
to the sync handle, which talks to an asyncio event loop in a thread. To learn more about
the reasoning behind these, check out our `architecture documentation <./architecture.html>`_.

Configuring HTTP Server Locations
=================================

By default, Ray Serve starts only one HTTP server on the head node of the Ray cluster.
You can configure this behavior using the ``http_options={"location": ...}`` flag
in :mod:`serve.start <ray.serve.start>`:

- "HeadOnly": start one HTTP server on the head node. Serve
  assumes the head node is the node you executed serve.start
  on. This is the default.
- "EveryNode": start one HTTP server per node.
- "NoServer" or ``None``: disable HTTP server.

.. note::
   Using the "EveryNode" option, you can point a cloud load balancer to the
   instance group of the Ray cluster to achieve high availability of Serve's HTTP
   proxies.

Variable HTTP Routes
====================

Ray Serve supports capturing path parameters. For example, in a call of the form

.. code-block:: python

    serve.create_endpoint("my_endpoint", backend="my_backend", route="/api/{username}")

the ``username`` parameter will be accessible in your backend code as follows:

.. code-block:: python

    def my_backend(request):
        username = request.path_params["username"]
        ...

Ray Serve uses Starlette's Router class under the hood for routing, so type
conversion for path parameters is also supported, as well as multiple path parameters.
For example, suppose this route is used:

.. code-block:: python

    serve.create_endpoint(
        "complex", backend="f", route="/api/{user_id:int}/{number:float}")

Then for a query to the route ``/api/123/3.14``, the ``request.path_params`` dictionary
available in the backend will be ``{"user_id": 123, "number": 3.14}``, where ``123`` is
a Python int and ``3.14`` is a Python float.

For full details on the supported path parameters, see Starlette's
`path parameters documentation <https://www.starlette.io/routing/#path-parameters>`_.
@@ -2,9 +2,9 @@
Deploying Ray Serve
===================

In the :doc:`core-apis`, you saw some of the basics of how to write serve applications.
In the :doc:`core-apis`, you saw some of the basics of how to write Serve applications.
This section will dive deeper into how Ray Serve runs on a Ray cluster and how you're able
to deploy and update your serve application over time.
to deploy and update your Serve application over time.

.. contents:: Deploying Ray Serve
@@ -8,157 +8,11 @@ questions, feel free to ask them in the `Discussion Board <https://discuss.ray.i

.. contents::

How do I deploy serve?
----------------------
How do I deploy Ray Serve?
--------------------------

See :doc:`deployment` for information about how to deploy serve.
See :doc:`deployment` for information about how to deploy Serve.

How do I call an endpoint from Python code?
-------------------------------------------

Use :mod:`serve.get_handle <ray.serve.api.get_handle>` to get a handle to the endpoint,
then use :mod:`handle.remote <ray.serve.handle.RayServeHandle.remote>` to send requests to that
endpoint. This returns a Ray ObjectRef whose result can be waited for or retrieved using
``ray.wait`` or ``ray.get``, respectively.

.. code-block:: python

    handle = serve.get_handle("api_endpoint")
    ray.get(handle.remote(request))

How do I call a method on my replica besides ``__call__``?
----------------------------------------------------------

To call a method via HTTP, use the header field ``X-SERVE-CALL-METHOD``.

To call a method via Python, use :mod:`handle.options <ray.serve.handle.RayServeHandle.options>`:

.. code-block:: python

    class StatefulProcessor:
        def __init__(self):
            self.count = 1

        def __call__(self, request):
            return {"current": self.count}

        def other_method(self, inc):
            self.count += inc
            return True

    handle = serve.get_handle("endpoint_name")
    handle.options(method_name="other_method").remote(5)

The call is the same as a regular query except a different method is called
within the replica.

How do I use custom status codes in my response?
------------------------------------------------

You can return a `Starlette Response object <https://www.starlette.io/responses/>`_ from your backend code:

.. code-block:: python

    from starlette.responses import Response

    def f(starlette_request):
        return Response('Hello, world!', status_code=123, media_type='text/plain')

    serve.create_backend("hello", f)

How do I enable CORS and other HTTP features?
---------------------------------------------

Serve supports arbitrary `Starlette middlewares <https://www.starlette.io/middleware/>`_
and custom middlewares in Starlette format. The example below shows how to enable
`Cross-Origin Resource Sharing (CORS) <https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS>`_.
You can follow the same pattern for other Starlette middlewares.

.. code-block:: python

    from starlette.middleware import Middleware
    from starlette.middleware.cors import CORSMiddleware

    client = serve.start(
        http_options={"middlewares": [
            Middleware(
                CORSMiddleware, allow_origins=["*"], allow_methods=["*"])
        ]})

.. _serve-handle-explainer:

How do ``ServeHandle`` and ``ServeRequest`` work?
-------------------------------------------------

Ray Serve enables you to query models both from HTTP and Python. This feature
enables seamless :ref:`model composition<serve-model-composition>`. You can
get a ``ServeHandle`` corresponding to an ``endpoint``, similar to how you can
reach an endpoint through HTTP via a specific route. When you issue a request
to an endpoint through ``ServeHandle``, the request goes through the same code
path as an HTTP request would: choosing backends through :ref:`traffic
policies <serve-split-traffic>` and load balancing across available replicas.

When the request arrives in the model, you can access the data similarly to how
you would with an HTTP request. Here are some examples of how ``ServeRequest``
mirrors ``starlette.requests.Request``:

.. list-table::
   :header-rows: 1

   * - HTTP
     - ServeHandle
     - | Request
       | (Starlette.Request and ServeRequest)
   * - ``requests.get(..., headers={...})``
     - ``handle.options(http_headers={...})``
     - ``request.headers``
   * - ``requests.post(...)``
     - ``handle.options(http_method="POST")``
     - ``request.method``
   * - ``requests.get(..., json={...})``
     - ``handle.remote({...})``
     - ``await request.json()``
   * - ``requests.get(..., form={...})``
     - ``handle.remote({...})``
     - ``await request.form()``
   * - ``requests.get(..., params={"a":"b"})``
     - ``handle.remote(a="b")``
     - ``request.query_params``
   * - ``requests.get(..., data="long string")``
     - ``handle.remote("long string")``
     - ``await request.body()``
   * - ``N/A``
     - ``handle.remote(python_object)``
     - ``request.data``

.. note::

   You might have noticed that the last row of the table shows that ``ServeRequest`` supports
   passing Python objects through the handle. This is not possible in HTTP. If you
   need to distinguish whether the origin of the request is from Python or HTTP, you can do an
   ``isinstance`` check:

   .. code-block:: python

       import starlette.requests

       if isinstance(request, starlette.requests.Request):
           print("Request coming from web!")
       elif isinstance(request, ServeRequest):
           print("Request coming from Python!")

.. note::

   One special case is when you pass a web request to a handle.

   .. code-block:: python

       handle.remote(starlette_request)

   In this case, Serve will `not` wrap it in ServeRequest. You can directly
   process the request as a ``starlette.requests.Request``.

How fast is Ray Serve?
----------------------
@@ -172,8 +26,8 @@ You can checkout our `microbenchmark instruction <https://github.com/ray-projec
to benchmark on your hardware.

Can I use asyncio along with Ray Serve?
---------------------------------------
Can I use ``asyncio`` along with Ray Serve?
-------------------------------------------

Yes! You can make your servable methods ``async def`` and Serve will run them
concurrently inside a Python asyncio event loop.
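As a minimal sketch of this, consider the following. The class itself is plain Python; only the commented-out registration call assumes a running Ray cluster, and the backend name ``"sleepy"`` is a hypothetical example:

```python
import asyncio

# A minimal async servable sketch. Awaiting inside __call__ lets many
# requests overlap in a single event loop instead of each occupying a
# thread while waiting on I/O.
class SleepyBackend:
    async def __call__(self, request):
        await asyncio.sleep(0.01)  # stands in for non-blocking I/O
        return {"status": "done"}

# With a running cluster, registration would look roughly like:
# serve.create_backend("sleepy", SleepyBackend)

# The coroutine can also be exercised directly:
result = asyncio.run(SleepyBackend()(None))
```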
doc/source/serve/http-servehandle.rst (new file, 255 lines)
@@ -0,0 +1,255 @@
==========================================
Calling Endpoints via HTTP and ServeHandle
==========================================

.. contents:: Calling Endpoints via HTTP and ServeHandle

Overview
========

Ray Serve endpoints can be called in two ways: from HTTP and from Python.
On this page we will show you both of these approaches and then give a tutorial
on how to integrate Ray Serve with an existing web server.

Calling Endpoints via HTTP
==========================

As described in the :doc:`tutorial`, when you create a Ray Serve endpoint, to
serve it over HTTP you just need to specify the ``route`` parameter to ``serve.create_endpoint``:

.. code-block:: python

    serve.create_endpoint("my_endpoint", backend="my_backend", route="/counter")

Below, we discuss some advanced features for customizing Ray Serve's HTTP functionality.

Configuring HTTP Server Locations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, Ray Serve starts a single HTTP server on the head node of the Ray cluster.
You can configure this behavior using the ``http_options={"location": ...}`` flag
in :mod:`serve.start <ray.serve.start>`:

- "HeadOnly": start one HTTP server on the head node. Serve
  assumes the head node is the node you executed serve.start
  on. This is the default.
- "EveryNode": start one HTTP server per node.
- "NoServer" or ``None``: disable HTTP server.

.. note::
   Using the "EveryNode" option, you can point a cloud load balancer to the
   instance group of the Ray cluster to achieve high availability of Serve's HTTP
   proxies.

Variable HTTP Routes
^^^^^^^^^^^^^^^^^^^^

Ray Serve supports capturing path parameters. For example, in a call of the form

.. code-block:: python

    serve.create_endpoint("my_endpoint", backend="my_backend", route="/api/{username}")

the ``username`` parameter will be accessible in your backend code as follows:

.. code-block:: python

    def my_backend(request):
        username = request.path_params["username"]
        ...

Ray Serve uses Starlette's Router class under the hood for routing, so type
conversion for path parameters is also supported, as well as multiple path parameters.
For example, suppose this route is used:

.. code-block:: python

    serve.create_endpoint(
        "complex", backend="f", route="/api/{user_id:int}/{number:float}")

Then for a query to the route ``/api/123/3.14``, the ``request.path_params`` dictionary
available in the backend will be ``{"user_id": 123, "number": 3.14}``, where ``123`` is
a Python int and ``3.14`` is a Python float.

For full details on the supported path parameters, see Starlette's
`path parameters documentation <https://www.starlette.io/routing/#path-parameters>`_.

Custom HTTP response status codes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can return a `Starlette Response object <https://www.starlette.io/responses/>`_ from your Ray Serve backend code:

.. code-block:: python

    from starlette.responses import Response

    def f(starlette_request):
        return Response('Hello, world!', status_code=123, media_type='text/plain')

    serve.create_backend("hello", f)

Enabling CORS and other HTTP middlewares
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Serve supports arbitrary `Starlette middlewares <https://www.starlette.io/middleware/>`_
and custom middlewares in Starlette format. The example below shows how to enable
`Cross-Origin Resource Sharing (CORS) <https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS>`_.
You can follow the same pattern for other Starlette middlewares.

.. code-block:: python

    from starlette.middleware import Middleware
    from starlette.middleware.cors import CORSMiddleware

    client = serve.start(
        http_options={"middlewares": [
            Middleware(
                CORSMiddleware, allow_origins=["*"], allow_methods=["*"])
        ]})

.. _serve-handle-explainer:

ServeHandle: Calling Endpoints from Python
==========================================

Ray Serve enables you to query models both from HTTP and Python. This feature
enables seamless :ref:`model composition<serve-model-composition>`. You can
get a ``ServeHandle`` corresponding to an ``endpoint``, similar to how you can
reach an endpoint through HTTP via a specific route. When you issue a request
to an endpoint through ``ServeHandle``, the request goes through the same code
path as an HTTP request would: choosing backends through :ref:`traffic
policies <serve-split-traffic>` and load balancing across available replicas.

To call a Ray Serve endpoint from Python, use :mod:`serve.get_handle <ray.serve.api.get_handle>`
to get a handle to the endpoint, then use
:mod:`handle.remote <ray.serve.handle.RayServeHandle.remote>` to send requests to that
endpoint. This returns a Ray ObjectRef whose result can be waited for or retrieved using
``ray.wait`` or ``ray.get``, respectively.

.. code-block:: python

    handle = serve.get_handle("api_endpoint")
    ray.get(handle.remote(request))

Accessing data from the request
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When the request arrives in the model, you can access the data similarly to how
you would with an HTTP request. Here are some examples of how Ray Serve's built-in
``ServeRequest`` mirrors ``starlette.requests.Request``:

.. list-table::
   :header-rows: 1

   * - HTTP
     - ServeHandle
     - | Request
       | (Starlette.Request and ServeRequest)
   * - ``requests.get(..., headers={...})``
     - ``handle.options(http_headers={...})``
     - ``request.headers``
   * - ``requests.post(...)``
     - ``handle.options(http_method="POST")``
     - ``request.method``
   * - ``requests.get(..., json={...})``
     - ``handle.remote({...})``
     - ``await request.json()``
   * - ``requests.get(..., form={...})``
     - ``handle.remote({...})``
     - ``await request.form()``
   * - ``requests.get(..., params={"a":"b"})``
     - ``handle.remote(a="b")``
     - ``request.query_params``
   * - ``requests.get(..., data="long string")``
     - ``handle.remote("long string")``
     - ``await request.body()``
   * - ``N/A``
     - ``handle.remote(python_object)``
     - ``request.data``

.. note::

   You might have noticed that the last row of the table shows that ``ServeRequest`` supports
   passing Python objects through the handle. This is not possible in HTTP. If you
   need to distinguish whether the origin of the request is from Python or HTTP, you can do an
   ``isinstance`` check:

   .. code-block:: python

       import starlette.requests

       if isinstance(request, starlette.requests.Request):
           print("Request coming from web!")
       elif isinstance(request, ServeRequest):
           print("Request coming from Python!")

.. note::

   One special case is when you pass a web request to a handle.

   .. code-block:: python

       handle.remote(starlette_request)

   In this case, Serve will `not` wrap it in ServeRequest. You can directly
   process the request as a ``starlette.requests.Request``.

.. _serve-sync-async-handles:

Sync and Async Handles
^^^^^^^^^^^^^^^^^^^^^^

Ray Serve offers two types of ``ServeHandle``. You can use the ``serve.get_handle(..., sync=True|False)``
flag to toggle between them.

- When you set ``sync=True`` (the default), a synchronous handle is returned.
  Calling ``handle.remote()`` returns a Ray ObjectRef.
- When you set ``sync=False``, an asyncio-based handle is returned. You need to
  call it with ``await handle.remote()`` to get a Ray ObjectRef. To use ``await``,
  you have to run ``serve.get_handle`` and ``handle.remote`` in a Python asyncio event loop.

The async handle has a performance advantage because it uses asyncio directly, as compared
to the sync handle, which talks to an asyncio event loop in a thread. To learn more about
the reasoning behind these, check out our `architecture documentation <./architecture.html>`_.
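To make the two calling conventions concrete, here is an illustrative sketch. ``FakeSyncHandle`` and ``FakeAsyncHandle`` are hypothetical stand-ins, not part of the Ray Serve API, and they return plain values where real handles return Ray ObjectRefs:

```python
import asyncio

# Hypothetical stand-ins illustrating the calling conventions only.
class FakeSyncHandle:
    def remote(self, x):
        # A sync handle is called like a normal method from any thread.
        return x * 2

class FakeAsyncHandle:
    async def remote(self, x):
        # An async handle must be awaited from inside an event loop.
        await asyncio.sleep(0)
        return x * 2

sync_result = FakeSyncHandle().remote(21)

async def main():
    return await FakeAsyncHandle().remote(21)

async_result = asyncio.run(main())
```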

.. _serve-custom-methods:

Calling methods on a Serve backend besides ``__call__``
=======================================================

By default, Ray Serve will serve the user-defined ``__call__`` method of your class, but
other methods of your class can be served as well.

To call a custom method via HTTP, pass in the method name in the header field ``X-SERVE-CALL-METHOD``.
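For example, a sketch of the HTTP side, where the URL, port, and method name are placeholders; the request is only prepared here, not sent:

```python
import requests

# Build (but do not send) a request targeting a hypothetical endpoint,
# asking Serve to dispatch to `other_method` instead of __call__.
req = requests.Request(
    "GET",
    "http://127.0.0.1:8000/endpoint",
    headers={"X-SERVE-CALL-METHOD": "other_method"},
).prepare()

# requests.Session().send(req) would dispatch it to a running Serve app.
```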

To call a custom method via Python, use :mod:`handle.options <ray.serve.handle.RayServeHandle.options>`:

.. code-block:: python

    class StatefulProcessor:
        def __init__(self):
            self.count = 1

        def __call__(self, request):
            return {"current": self.count}

        def other_method(self, inc):
            self.count += inc
            return True

    handle = serve.get_handle("endpoint_name")
    handle.options(method_name="other_method").remote(5)

The call is the same as a regular query except a different method is called
within the replica.

Integrating with existing web servers
=====================================

Ray Serve comes with its own HTTP server out of the box, but if you have an existing
web application, you can still plug in Ray Serve to scale up your backend computation.

Using ``ServeHandle`` makes this easy.
For a tutorial with sample code, see :ref:`serve-web-server-integration-tutorial`.
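The overall pattern can be sketched as follows. ``DummyHandle`` is a hypothetical stand-in for the object returned by ``serve.get_handle``, and ``web_handler`` is a placeholder for your web framework's route handler; a real handler would resolve the returned ObjectRef with ``ray.get``:

```python
# Stand-in for a ServeHandle; a real handle's .remote() returns an ObjectRef.
class DummyHandle:
    def remote(self, data):
        return data.upper()  # pretend the heavy computation happens in Serve

def web_handler(body, handle=None):
    # An existing web route delegates the heavy lifting to Serve and
    # wraps the result in its own response format.
    handle = handle or DummyHandle()
    return {"result": handle.remote(body)}

response = web_handler("hello")
```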

@@ -54,9 +54,13 @@ For our Counter class to work with Ray Serve, it needs to be a *callable* class,
        self.count += 1
        return {"count": self.count}

.. note::
.. tip::

   In addition to callable classes, you can also serve functions using Ray Serve.
   You can also serve :ref:`other class methods<serve-custom-methods>` besides ``__call__``.

.. note::

   Besides classes, you can also serve standalone functions with Ray Serve in the same way.

Now we are ready to deploy our class using Ray Serve. First, create a Ray Serve backend and pass in the Counter class:
@@ -594,7 +594,7 @@ class Client:
                "You are retrieving a sync handle inside an asyncio loop. "
                "Try getting client.get_handle(.., sync=False) to get better "
                "performance. Learn more at https://docs.ray.io/en/master/"
                "serve/advanced.html#sync-and-async-handles")
                "serve/http-servehandle.html#sync-and-async-handles")

        if not asyncio.get_event_loop().is_running() and not sync:
            logger.warning(

@@ -602,7 +602,7 @@ class Client:
                "You should make sure client.get_handle is called inside a "
                "running event loop. Or call client.get_handle(.., sync=True) "
                "to create sync handle. Learn more at https://docs.ray.io/en/"
                "master/serve/advanced.html#sync-and-async-handles")
                "master/serve/http-servehandle.html#sync-and-async-handles")

        if endpoint_name in all_endpoints:
            this_endpoint = all_endpoints[endpoint_name]
@@ -37,7 +37,7 @@ scipy==1.4.1
tabulate
tensorboardX
uvicorn
pydantic
pydantic>=1.8
dataclasses; python_version < '3.7'
starlette