diff --git a/doc/.gitignore b/doc/.gitignore
index 8878d3cc1..4f5c16559 100644
--- a/doc/.gitignore
+++ b/doc/.gitignore
@@ -1,3 +1,4 @@
 # Generated documentation files
 _build
 source/_static/thumbs
+.ipynb_checkpoints/
\ No newline at end of file
diff --git a/doc/source/serve/tutorials/gradio.ipynb b/doc/source/serve/tutorials/gradio.ipynb
new file mode 100644
index 000000000..5216ae79e
--- /dev/null
+++ b/doc/source/serve/tutorials/gradio.ipynb
@@ -0,0 +1,364 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "0c5705f2",
+   "metadata": {},
+   "source": [
+    "(gradio-serve-tutorial)=\n",
+    "\n",
+    "# Building a Gradio demo with Ray Serve\n",
+    "\n",
+    "In this example, we will show you how to wrap a machine learning model served\n",
+    "by Ray Serve in a [Gradio demo](https://gradio.app/).\n",
+    "\n",
+    "Specifically, we're going to download a GPT-2 model from the `transformers` library,\n",
+    "define a Ray Serve deployment with it, and then define and launch a Gradio `Interface`.\n",
+    "Let's take a look."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c017f8c4",
+   "metadata": {
+    "tags": [
+     "remove-cell"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "# Install all dependencies for this example.\n",
+    "! pip install ray gradio transformers requests"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6245b4c3",
+   "metadata": {},
+   "source": [
+    "## Deploying a model with Ray Serve\n",
+    "\n",
+    "To start off, we import Ray Serve, Gradio, and the `transformers` and `requests` libraries,\n",
+    "and then simply start Ray Serve:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "79d354ae",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import gradio as gr\n",
+    "from ray import serve\n",
+    "from transformers import pipeline\n",
+    "import requests\n",
+    "\n",
+    "\n",
+    "serve.start()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "85b1eba9",
+   "metadata": {},
+   "source": [
+    "Next, we define a Ray Serve deployment with a GPT-2 model by using the `@serve.deployment` decorator on a `model`\n",
+    "function that takes a `request` argument.\n",
+    "In this function, we create a GPT-2 model with a call to `pipeline` and return the result of querying it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6ef8e2c7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@serve.deployment\n",
+    "def model(request):\n",
+    "    language_model = pipeline(\"text-generation\", model=\"gpt2\")\n",
+    "    query = request.query_params[\"query\"]\n",
+    "    return language_model(query, max_length=100)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ba7be609",
+   "metadata": {},
+   "source": [
+    "This `model` can now be deployed with a single `model.deploy()` call.\n",
+    "To test the deployment, we use a simple `example` query to get a `response` from the model running\n",
+    "on `localhost:8000/model`.\n",
+    "The first time you use this endpoint, the model will be downloaded, which can take a while.\n",
+    "Subsequent calls will be faster."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c278dfb7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model.deploy()\n",
+    "example = \"What's the meaning of life?\"\n",
+    "response = requests.get(f\"http://localhost:8000/model?query={example}\")\n",
+    "print(response.text)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0b11e675",
+   "metadata": {},
+   "source": [
+    "## Defining and launching a Gradio interface\n",
+    "\n",
+    "Defining a Gradio interface is now straightforward.\n",
+    "All we need is a function that Gradio can call to get the response from the model.\n",
+    "That's just a thin wrapper around our previous `requests` call:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "61c3ab00",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def gpt2(query):\n",
+    "    response = requests.get(f\"http://localhost:8000/model?query={query}\")\n",
+    "    return response.json()[0][\"generated_text\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "53b4a5ef",
+   "metadata": {},
+   "source": [
+    "Apart from our `gpt2` function, the only other thing we need to define a Gradio interface is\n",
+    "a description of the model's inputs and outputs in a format that Gradio understands.\n",
+    "Since our model takes text as both input and output, this turns out to be pretty simple:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "115fb25f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "iface = gr.Interface(\n",
+    "    fn=gpt2,\n",
+    "    inputs=[gr.inputs.Textbox(\n",
+    "        default=example, label=\"Input prompt\"\n",
+    "    )],\n",
+    "    outputs=[gr.outputs.Textbox(label=\"Model output\")]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2e998109",
+   "metadata": {},
+   "source": [
+    "For more complex models served with Ray, you might need multiple `gr.inputs`\n",
+    "and `gr.outputs` of different types.\n",
+    "\n",
+    "```{margin}\n",
+    "The [Gradio documentation](https://gradio.app/docs/) covers all available input and output components in detail.\n",
+    "```\n",
+    "\n",
+    "Finally, we can launch the interface using `iface.launch()`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "203ce70e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "iface.launch()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5e5638a9",
+   "metadata": {},
+   "source": [
+    "This should launch an interactive interface that looks like this:\n",
+    "\n",
+    "```{image} https://raw.githubusercontent.com/ray-project/images/master/docs/serve/gradio_serve_gpt.png\n",
+    "```\n",
+    "\n",
+    "You can run this example directly in the browser, for instance by launching this notebook\n",
+    "in Google Colab or Binder, by clicking on the _rocket icon_ at the top right of this page.\n",
+    "If you run this code locally in Python, this Gradio app will be served on `http://127.0.0.1:7861/`.\n",
+    "\n",
+    "## Building a Gradio app from a Scikit-Learn model\n",
+    "\n",
+    "Let's take a look at another example, so that you can directly compare the slight differences\n",
+    "from the first example."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0fdc6b92",
+   "metadata": {
+    "tags": [
+     "remove-cell"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "# Install all dependencies for this example.\n",
+    "! pip install ray gradio requests scikit-learn"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "257744c8",
+   "metadata": {},
+   "source": [
+    "This time we're going to use a [Scikit-Learn](https://scikit-learn.org/) model that we quickly train\n",
+    "ourselves on the famous Iris dataset.\n",
+    "To do this, we'll load the Iris dataset using the built-in `load_iris` function from the `sklearn` library,\n",
+    "and we'll use the `GradientBoostingClassifier` from the `sklearn.ensemble` module for training.\n",
+    "\n",
+    "Also, this time we'll use the `@serve.deployment` decorator on a _class_ called `BoostingModel`, which has an\n",
+    "asynchronous `__call__` method that Ray Serve uses to handle incoming requests.\n",
+    "All else remains the same as in the first example."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cb92f167",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import gradio as gr\n",
+    "import requests\n",
+    "from sklearn.datasets import load_iris\n",
+    "from sklearn.ensemble import GradientBoostingClassifier\n",
+    "\n",
+    "from ray import serve\n",
+    "\n",
+    "# Train your model.\n",
+    "iris_dataset = load_iris()\n",
+    "model = GradientBoostingClassifier()\n",
+    "model.fit(iris_dataset[\"data\"], iris_dataset[\"target\"])\n",
+    "\n",
+    "# Start Ray Serve.\n",
+    "serve.start()\n",
+    "\n",
+    "# Define your deployment.\n",
+    "@serve.deployment(route_prefix=\"/iris\")\n",
+    "class BoostingModel:\n",
+    "    def __init__(self, model):\n",
+    "        self.model = model\n",
+    "        self.label_list = iris_dataset[\"target_names\"].tolist()\n",
+    "\n",
+    "    async def __call__(self, request):\n",
+    "        payload = (await request.json())[\"vector\"]\n",
+    "        print(f\"Received http request with data {payload}\")\n",
+    "\n",
+    "        prediction = self.model.predict([payload])[0]\n",
+    "        human_name = self.label_list[prediction]\n",
+    "        return {\"result\": human_name}\n",
+    "\n",
+    "\n",
+    "# Deploy your model.\n",
+    "BoostingModel.deploy(model)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "30c3ef21",
+   "metadata": {},
+   "source": [
+    "Equipped with our `BoostingModel` class, we can now define and launch a Gradio interface as follows.\n",
+    "The Iris dataset has a total of four features, namely the four numeric values _sepal length_, _sepal width_,\n",
+    "_petal length_, and _petal width_.\n",
+    "We use this fact to define an `iris` function that takes these four features and returns the predicted class,\n",
+    "using our deployed model.\n",
+    "This time, the Gradio interface takes four input `Number`s and returns the predicted class as `text`.\n",
+    "Go ahead and try it out in the browser yourself."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "733fb4f5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Define the Gradio function.\n",
+    "def iris(sl, sw, pl, pw):\n",
+    "    request_input = {\"vector\": [sl, sw, pl, pw]}\n",
+    "    response = requests.get(\n",
+    "        \"http://localhost:8000/iris\", json=request_input)\n",
+    "    return response.json()[\"result\"]\n",
+    "\n",
+    "\n",
+    "# Define the Gradio interface.\n",
+    "iface = gr.Interface(\n",
+    "    fn=iris,\n",
+    "    inputs=[\n",
+    "        gr.inputs.Number(default=1.0, label=\"sepal length (cm)\"),\n",
+    "        gr.inputs.Number(default=1.0, label=\"sepal width (cm)\"),\n",
+    "        gr.inputs.Number(default=1.0, label=\"petal length (cm)\"),\n",
+    "        gr.inputs.Number(default=1.0, label=\"petal width (cm)\"),\n",
+    "    ],\n",
+    "    outputs=\"text\")\n",
+    "\n",
+    "# Launch the Gradio interface.\n",
+    "iface.launch()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a3e47ff7",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "When you launch this app, you should see an interactive interface that looks like this:\n",
+    "\n",
+    "```{image} https://raw.githubusercontent.com/ray-project/images/master/docs/serve/gradio_serve_iris.png\n",
+    "```\n",
+    "\n",
+    "## Conclusion\n",
+    "\n",
+    "To summarize, it's easy to build Gradio apps from Ray Serve deployments.\n",
+    "You only need to properly encode your model's inputs and outputs in a Gradio interface, and you're good to go!"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
\ No newline at end of file
diff --git a/doc/source/serve/tutorials/index.rst b/doc/source/serve/tutorials/index.rst
index 419818c17..716e48f53 100644
--- a/doc/source/serve/tutorials/index.rst
+++ b/doc/source/serve/tutorials/index.rst
@@ -10,12 +10,13 @@ Ray Serve functionality and how to integrate different modeling frameworks.
    :name: serve-tutorials
    :maxdepth: -1
 
-   tensorflow.rst
-   pytorch.rst
-   sklearn.rst
-   batch.rst
-   web-server-integration.rst
+   tensorflow
+   pytorch
+   sklearn
+   batch
+   web-server-integration
    rllib
+   gradio
 
 Other Topics: