{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(air-serving-guide)=\n",
"\n",
"# Deploying Predictors with Serve\n",
"\n",
"[Ray Serve](rayserve) is the recommended tool to deploy models trained with AIR.\n",
"\n",
"After training a model with Ray Train, you can serve a model using Ray Serve. In this guide, we will cover how to use Ray AIR's `PredictorDeployment`, `Predictor`, and `Checkpoint` abstractions to quickly deploy a model for online inference.\n",
"\n",
"But before that, let's review the key concepts:\n",
"- [`Checkpoint`](ray.air.checkpoint) represents a trained model stored in memory, file, or remote uri.\n",
"- [`Predictor`](ray.train.predictor.Predictor)s understand how to perform a model inference given checkpoints and the model definition. Ray AIR comes with predictors for each supported frameworks. \n",
"- [`Deployment`](serve-key-concepts-deployment) is a Ray Serve construct that represent an HTTP endpoint along with scalable pool of models.\n",
"\n",
"The core concept for model deployment is the `PredictorDeployment`. The `PredictorDeployment` takes a [predictor](ray.train.predictor.Predictor) class and a [checkpoint](ray.air.checkpoint) and transforms them into a live HTTP endpoint. \n",
"\n",
"We'll start with a simple quick-start demo showing how you can use the `PredictorDeployment` to deploy your model for online inference."
]
},
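{
"cell_type": "markdown",
"metadata": {},
"source": [
"To preview where we are headed, the deployment step itself is only a few lines of code. The following is a minimal sketch rather than a runnable cell: it assumes a training `result` produced by an `XGBoostTrainer` (as shown later in this guide), and the exact import paths for `PredictorDeployment` and `XGBoostPredictor` may vary across Ray versions.\n",
"\n",
"```python\n",
"from ray import serve\n",
"from ray.serve import PredictorDeployment       # import path may vary across Ray versions\n",
"from ray.train.xgboost import XGBoostPredictor\n",
"\n",
"# Bind the predictor class and a checkpoint to a deployment and start\n",
"# serving it over HTTP. `result.checkpoint` is assumed to come from the\n",
"# training run shown later in this guide.\n",
"serve.run(PredictorDeployment.bind(XGBoostPredictor, result.checkpoint))\n",
"```"
]
},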
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's first make sure Ray AIR is installed. For the quick-start, we'll also use Ray AIR to train and serve a XGBoost model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install \"ray[air]\" xgboost scikit-learn"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can find the preprocessor and trainer in the [key concepts walk-through](air-key-concepts)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2022-06-02 19:31:31,356\tINFO services.py:1483 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8265\u001b[39m\u001b[22m\n"
]
},
{
"data": {
"text/html": [
"== Status ==
Current time: 2022-06-02 19:31:48 (running for 00:00:13.38)
Memory usage on this node: 37.9/64.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/25.71 GiB heap, 0.0/2.0 GiB objects
Result logdir: /Users/simonmo/ray_results/XGBoostTrainer_2022-06-02_19-31-34
Number of trials: 1/1 (1 TERMINATED)
| Trial name | status | loc | iter | total time (s) | train-logloss | train-error | valid-logloss |
|---|---|---|---|---|---|---|---|
| XGBoostTrainer_4930d_00000 | TERMINATED | 127.0.0.1:60303 | 5 | 8.72108 | 0.190254 | 0.035176 | 0.20535 |