mirror of
https://github.com/vale981/ray
synced 2025-03-05 10:01:43 -05:00
[kubernetes][docs] Implement landing page and getting started guide (#26912)
Implements a landing page for the new KubeRay-based deployment guide and a "Getting started" Jupyter notebook.
This commit is contained in:
parent
262ce1acef
commit
a70ada7341
8 changed files with 673 additions and 125 deletions
@@ -9,7 +9,7 @@ This section provides instructions for configuring the Ray Cluster Launcher to u

See this blog post for a `step by step guide`_ to using the Ray Cluster Launcher.

To learn about deploying Ray on an existing Kubernetes cluster, refer to the guide :ref:`here<ray-k8s-deploy>`.
To learn about deploying Ray on an existing Kubernetes cluster, refer to the guide :ref:`here<kuberay-index>`.

.. _`step by step guide`: https://medium.com/distributed-computing-with-ray/a-step-by-step-guide-to-scaling-your-first-python-application-in-the-cloud-8761fe331ef1
@@ -17,7 +17,7 @@ done so often), but the real power is using Ray on a cluster of machines.
Ray can automatically interact with the cloud provider to request or release
instances. You can specify :ref:`a configuration <cluster-config>` to launch
clusters on :ref:`AWS, GCP, Azure (community-maintained), Aliyun (community-maintained), on-premise, or even on
your custom node provider <cluster-cloud>`. Ray can also be run on :ref:`Kubernetes <ray-k8s-deploy>` infrastructure.
your custom node provider <cluster-cloud>`. Ray can also be run on :ref:`Kubernetes <kuberay-index>` infrastructure.
Your cluster can have a fixed size
or :ref:`automatically scale up and down<cluster-autoscaler>` depending on the
demands of your application.
@@ -95,7 +95,7 @@ Cluster managers
----------------

You can simplify the process of managing Ray clusters using a number of popular
cluster managers including :ref:`Kubernetes<ray-k8s-deploy>`,
cluster managers including :ref:`Kubernetes<kuberay-index>`,
:ref:`YARN<ray-yarn-deploy>`, :ref:`Slurm<ray-slurm-deploy>` and :ref:`LSF<ray-LSF-deploy>`.

Kubernetes (K8s) operator

@@ -104,4 +104,4 @@ Kubernetes (K8s) operator
Deployments of Ray on Kubernetes are managed by the Ray Kubernetes Operator. The
Ray Operator makes it easy to deploy clusters of Ray pods within a Kubernetes
cluster. To learn more about the K8s operator, refer to
the :ref:`documentation<ray-operator>`.
the :ref:`documentation<kuberay-index>`.
@@ -1,11 +1,40 @@
(kuberay-index)=

# Ray on Kubernetes
(kuberay-index)=
## Overview

:::{warning}
This page is under construction!
:::
You can execute your distributed Ray programs on a Kubernetes cluster.

High-level description and link to the quickstart.
Links to the KubeRay GitHub and KubeRay docs.
Links to the rest of the subpages.
The [KubeRay Operator](https://ray-project.github.io/kuberay/components/operator/) provides a Kubernetes-native
interface for managing Ray clusters. Each Ray cluster consists of a head pod and a collection of worker pods.
Optional autoscaling support allows the KubeRay Operator to size your Ray clusters according to the requirements
of your Ray workload, adding and removing Ray pods as needed.

## Learn More

The Ray docs present all the information you need to start running Ray workloads on Kubernetes.

```{eval-rst}
.. panels::
    :container: text-center
    :column: col-lg-12 p-2
    :card:

    **Getting started**
    ^^^

    Learn how to start a Ray cluster and deploy Ray applications on Kubernetes.

    +++
    .. link-button:: kuberay-quickstart
        :type: ref
        :text: Get Started with Ray on Kubernetes
        :classes: btn-outline-info btn-block
```

## The KubeRay project

Ray's Kubernetes support is developed at the [KubeRay GitHub repository](https://github.com/ray-project/kuberay), under the broader [Ray project](https://github.com/ray-project/).

- Visit the [KubeRay GitHub repo](https://github.com/ray-project/kuberay) to track progress, report bugs, propose new features, or contribute to
  the project.
- Check out the [KubeRay docs](https://ray-project.github.io/kuberay/) for further technical information, developer guides,
  and discussion of new and upcoming features.

doc/source/cluster/kuberay/quickstart.ipynb (new file, 630 additions)
@@ -0,0 +1,630 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f34f1b75",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"(kuberay-quickstart)=\n",
|
||||
"# Getting Started\n",
|
||||
"\n",
|
||||
"In this guide, we show you how to manage and interact with Ray clusters on Kubernetes.\n",
|
||||
"\n",
|
||||
"You can download this guide as an executable Jupyter notebook by clicking the download button on the top right of the page.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"## Preparation\n",
|
||||
"\n",
|
||||
"### Install the latest Ray release\n",
|
||||
"This step is needed to interact with remote Ray clusters using {ref}`Ray Job Submission <kuberay-job>` and {ref}`Ray Client <kuberay-client>`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6dcd7d93",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! pip install -U \"ray[default]\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "656a0707",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"See {ref}`installation` for more details. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c0933e2f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Install kubectl\n",
|
||||
"\n",
|
||||
"We will use kubectl to interact with Kubernetes. Find installation instructions at the [Kubernetes documentation](https://kubernetes.io/docs/tasks/tools/#kubectl).\n",
|
||||
"\n",
|
||||
"### Access a Kubernetes cluster\n",
|
||||
"\n",
|
||||
"We will need access to a Kubernetes cluster. There are two options:\n",
|
||||
"1. Configure access to a remote Kubernetes cluster\n",
|
||||
"**OR**\n",
|
||||
"\n",
|
||||
"2. Run the examples locally by [installing kind](https://kind.sigs.k8s.io/docs/user/quick-start/#installation). Start your [kind](https://kind.sigs.k8s.io/) cluster by running the following command:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c764b3ad",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kind create cluster"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "278726e0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"To run the example in this guide, make sure your Kubernetes cluster (or local Kind cluster) can accomodate\n",
|
||||
"additional resource requests of 3 CPU and 2Gi memory. \n",
|
||||
"\n",
|
||||
"## Deploying the KubeRay operator\n",
|
||||
"\n",
|
||||
"Deploy the KubeRay Operator by cloning the KubeRay repo and applying the relevant configuration files from the master branch. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "43c66922",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# After the KubeRay 0.3.0 release branch cut, this documentation will be updated to refer to the 0.3.0 branch.\n",
|
||||
"! git clone https://github.com/ray-project/kuberay\n",
|
||||
"\n",
|
||||
"# This creates the KubeRay operator and all of the resources it needs.\n",
|
||||
"! kubectl create -k kuberay/ray-operator/config/default\n",
|
||||
"\n",
|
||||
"# Note that we must use \"kubectl create\" in the above command. \"kubectl apply\" will not work due to https://github.com/ray-project/kuberay/issues/271"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "522f9a97",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Confirm that the operator is running in the namespace `ray-system`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "dfec8bba",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl -n ray-system get pod --selector=app.kubernetes.io/component=kuberay-operator\n",
|
||||
"\n",
|
||||
"# NAME READY STATUS RESTARTS AGE\n",
|
||||
"# kuberay-operator-557c6c8bcd-t9zkz 1/1 Running 0 XXs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "3f0d3d17",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Namespace-scoped operator\n",
|
||||
"\n",
|
||||
"Note that the above command deploys the operator at _Kubernetes cluster scope_; the operator will manage resources in all Kubernetes namespaces.\n",
|
||||
"**If your use-case requires running the operator at single namespace scope**, refer to [the instructions at the KubeRay docs](https://github.com/ray-project/kuberay#single-namespace-version)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e1fdf3f5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Deploying a Ray Cluster"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "dac860db",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Once the KubeRay operator is running, we are ready to deploy a Ray cluster. To do so, we create a RayCluster Custom Resource (CR).\n",
|
||||
"\n",
|
||||
"In the rest of this guide, we will deploy resources into the default namespace. To use a non-default namespace, specify the namespace in your kubectl commands:\n",
|
||||
"\n",
|
||||
"`kubectl -n <your-namespace> ...`"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "30645643",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Deploy the Ray Cluster CR:\n",
|
||||
"! kubectl apply -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml\n",
|
||||
"\n",
|
||||
"# This Ray cluster is named `raycluster-autoscaler` because it has optional Ray Autoscaler support enabled."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1b6abf52",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Once the RayCluster CR has been created, you can view it by running"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "cb2363bc",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl get raycluster\n",
|
||||
"\n",
|
||||
"# NAME AGE\n",
|
||||
"# raycluster-autoscaler XXs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d4bd4e47",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The KubeRay operator will detect the RayCluster object. The operator will then start your Ray cluster by creating head and worker pods. To view Ray cluster's pods, run the following command:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "48d938b2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# View the pods in the Ray cluster named \"raycluster-autoscaler\"\n",
|
||||
"! kubectl get pods --selector=ray.io/cluster=raycluster-autoscaler\n",
|
||||
"\n",
|
||||
"# NAME READY STATUS RESTARTS AGE\n",
|
||||
"# raycluster-autoscaler-head-xxxxx 2/2 Running 0 XXs\n",
|
||||
"# raycluster-autoscaler-worker-small-group-yyyyy 1/1 Running 0 XXs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2fb1c582",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We see a Ray head pod with two containers -- the Ray container and autoscaler sidecar. We also have a Ray worker with its single Ray container.\n",
|
||||
"\n",
|
||||
"Wait for the pods to reach Running state. This may take a few minutes -- most of this time is spent downloading the Ray images. In a separate shell, you may wish to observe the pods' status in real-time with the following command:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "68dab885",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# If you're on MacOS, first `brew install watch`.\n",
|
||||
"# Run in a separate shell:\n",
|
||||
"! watch -n 1 kubectl get pod"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b63e1ab9",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Interacting with a Ray Cluster\n",
|
||||
"\n",
|
||||
"Now, let's interact with the Ray cluster we've deployed.\n",
|
||||
"\n",
|
||||
"### Accessing the cluster with kubectl exec\n",
|
||||
"\n",
|
||||
"The most straightforward way to experiment with your Ray cluster is to\n",
|
||||
"exec directly into the head pod. First, identify your Ray cluster's head pod:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2538c4fd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl get pods --selector=ray.io/cluster=raycluster-autoscaler --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers\n",
|
||||
" \n",
|
||||
"# raycluster-autoscaler-head-xxxxx"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "190b2163",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now, we can run a Ray program on the head pod. The Ray program in the next cell asks the autoscaler to scale the cluster to a total of 3 CPUs. The head and worker in our example cluster each have a capacity of 1 CPU, so the request should trigger upscaling of an additional worker pod.\n",
|
||||
"\n",
|
||||
"Note that in real-life scenarios, you will want to use larger Ray pods. In fact, it is advantageous to size each Ray pod to take up an entire Kubernetes node. See the {ref}`configuration guide<kuberay-config>` for more details."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c35b2454",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Substitute your output from the last cell in place of \"raycluster-autoscaler-head-xxxxx\"\n",
|
||||
"\n",
|
||||
"! kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- python -c \"import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=3)\""
|
||||
]
|
||||
},
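{
"cell_type": "markdown",
"id": "a1b2c3d0",
"metadata": {},
"source": [
"As an aside, `ray.autoscaler.sdk.request_resources` also accepts a `bundles` argument for requesting sets of resources instead of a bare CPU count. The next cell is an optional sketch of the same 3-CPU request expressed as three 1-CPU bundles; it assumes the same placeholder head pod name used above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1b2c3d1",
"metadata": {},
"outputs": [],
"source": [
"# Optional sketch: the same scale-up request expressed as three 1-CPU bundles.\n",
"# Substitute your head pod's name in place of \"raycluster-autoscaler-head-xxxxx\".\n",
"! kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- python -c \"import ray; ray.init(); ray.autoscaler.sdk.request_resources(bundles=[{'CPU': 1}] * 3)\""
]
},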
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b1d81b29",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Autoscaling\n",
|
||||
"\n",
|
||||
"The last command should have triggered Ray pod upscaling. To confirm the new worker pod is up, let's query the RayCluster's pods again:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "37842ea9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl get pod --selector=ray.io/cluster=raycluster-autoscaler\n",
|
||||
"\n",
|
||||
"# NAME READY STATUS RESTARTS AGE\n",
|
||||
"# raycluster-autoscaler-head-xxxxx 2/2 Running 0 XXs\n",
|
||||
"# raycluster-autoscaler-worker-small-group-yyyyy 1/1 Running 0 XXs\n",
|
||||
"# raycluster-autoscaler-worker-small-group-zzzzz 1/1 Running 0 XXs "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "222c64f4",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"To get a summary of your cluster's status, run `ray status` on your cluster's Ray head node."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6fa373aa",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Substitute your head pod's name in place of \"raycluster-autoscaler-head-xxxxx\"\n",
|
||||
"! kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- ray status\n",
|
||||
"\n",
|
||||
"# ======== Autoscaler status: 2022-07-21 xxxxxxxxxx ========\n",
|
||||
"# ...."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b56e07a3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Alternatively, to examine the full autoscaling logs, fetch the stdout of the Ray head pod's autoscaler sidecar:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4ec93304",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# This command gets the last 20 lines of autoscaler logs.\n",
|
||||
"\n",
|
||||
"# Substitute your head pod's name in place of \"raycluster-autoscaler-head-xxxxx\"\n",
|
||||
"! kubectl logs raycluster-autoscaler-head-xxxxx -c autoscaler | tail -n 20\n",
|
||||
"\n",
|
||||
"# ======== Autoscaler status: 2022-07-21 xxxxxxxxxx ========\n",
|
||||
"# ..."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "40d0a503",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### The Ray head service"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c6e257f2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The KubeRay operator configures a [Kubernetes service](https://kubernetes.io/docs/concepts/services-networking/service/) targeting the Ray head pod. This service allows us to interact with Ray clusters without directly executing commands in the Ray container. To identify the Ray head service for our example cluster, run"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "d3dae5fd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl get service raycluster-autoscaler-head-svc\n",
|
||||
"\n",
|
||||
"# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\n",
|
||||
"# raycluster-autoscaler-head-svc ClusterIP 10.96.114.20 <none> 6379/TCP,8265/TCP,10001/TCP XXs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e29c48ba",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"(kuberay-job)=\n",
|
||||
"### Ray Job submission"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "243ad524",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Ray provides a [Job Submission API](https://docs.ray.io/en/master/cluster/job-submission.html#ray-job-submission) which can be used to submit Ray workloads to a remote Ray cluster. The Ray Job Submission server listens on the Ray head's Dashboard port, 8265 by default. Let's access the dashboard port via port-forwarding. \n",
|
||||
"\n",
|
||||
"Note: The following port-forwarding command is blocking. If you are following along from a Jupyter notebook, the command must be executed in a separate shell outside of the notebook."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "23c113b6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Execute this in a separate shell.\n",
|
||||
"! kubectl port-forward service/raycluster-autoscaler-head-svc 8265:8265"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ec66649c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note: We use port-forwarding in this guide as a simple way to experiment with a Ray cluster's services. For production use-cases, you would typically either \n",
|
||||
"- Access the service from within the Kubernetes cluster or\n",
|
||||
"- Use an ingress controller to expose the service outside the cluster.\n",
|
||||
"\n",
|
||||
"See the {ref}`networking notes <kuberay-networking>` for details.\n",
|
||||
"\n",
|
||||
"Now that we have access to the Dashboard port, we can submit jobs to the Ray Cluster:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "28a6bca6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# The following job's logs will show the Ray cluster's total resource capacity, including 3 CPUs.\n",
|
||||
"\n",
|
||||
"! ray job submit --address http://localhost:8265 -- python -c \"import ray; ray.init(); print(ray.cluster_resources())\""
|
||||
]
|
||||
},
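{
"cell_type": "markdown",
"id": "b2c3d4e0",
"metadata": {},
"source": [
"If you prefer to stay in Python, the Job Submission API also ships a client in `ray.job_submission`. The next cell is an optional, minimal sketch of the same submission using `JobSubmissionClient`; it assumes the dashboard port-forward to `localhost:8265` above is still running."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2c3d4e1",
"metadata": {},
"outputs": [],
"source": [
"# Optional sketch: submit the same job with the Python JobSubmissionClient.\n",
"# Assumes the port-forward to localhost:8265 is still active.\n",
"from ray.job_submission import JobSubmissionClient\n",
"\n",
"client = JobSubmissionClient(\"http://localhost:8265\")\n",
"job_id = client.submit_job(\n",
"    entrypoint='python -c \"import ray; ray.init(); print(ray.cluster_resources())\"'\n",
")\n",
"print(job_id, client.get_job_status(job_id))"
]
},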
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c5d52948",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Viewing the Ray Dashboard\n",
|
||||
"\n",
|
||||
"Assuming the port-forwarding process described above is still running, you may view the {ref}`ray-dashboard` by visiting `localhost:8265` in you browser.\n",
|
||||
"\n",
|
||||
"The dashboard port will not be used in the rest of this guide. You may stop the port-forwarding process if you wish.\n",
|
||||
"\n",
|
||||
"(kuberay-client)=\n",
|
||||
"### Accessing the cluster using Ray Client\n",
|
||||
"\n",
|
||||
"[Ray Client](https://docs.ray.io/en/latest/cluster/ray-client.html) allows you to interact programatically with a remote Ray cluster using the core Ray APIs.\n",
|
||||
"To try out Ray Client, first make sure your local Ray version and Python minor version match the versions used in your Ray cluster. The Ray cluster in our example is running Ray 2.0.0 and Python 3.7, so that's what we'll need locally. If you have a different local Python version and would like to avoid changing it, you can modify the images specified in the yaml file `ray-cluster.autoscaler.yaml`. For example, use `rayproject/ray:2.0.0-py38` for Python 3.8.\n",
|
||||
"\n",
|
||||
"After confirming the Ray and Python versions match up, the next step is to port-forward the Ray Client server port (10001 by default).\n",
|
||||
"If you are following along in a Jupyter notebook, execute the following command in a separate shell."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "d1f4154f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Execute this in a separate shell.\n",
|
||||
"! kubectl port-forward service/raycluster-autoscaler-head-svc 10001:10001"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6ab3bdfe",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now that we have port-forwarding set up, we can connect to the Ray Client from a local Python shell as follows:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c30245c2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import ray\n",
|
||||
"import platform\n",
|
||||
"\n",
|
||||
"ray.init(\"ray://localhost:10001\")\n",
|
||||
"\n",
|
||||
"# The network name of the local machine.\n",
|
||||
"local_host_name = platform.node()\n",
|
||||
"\n",
|
||||
"# This is a Ray task.\n",
|
||||
"# The task will returns the name of the Ray pod that executes it.\n",
|
||||
"@ray.remote\n",
|
||||
"def get_host_name():\n",
|
||||
" return platform.node()\n",
|
||||
"\n",
|
||||
"# The task will be scheduled on the head node.\n",
|
||||
"# Thus, this variable will hold the head pod's name.\n",
|
||||
"remote_host_name = ray.get(get_host_name.remote())\n",
|
||||
"\n",
|
||||
"print(\"The local host name is {}\".format(local_host_name))\n",
|
||||
"print(\"The Ray head pod's name is {}\".format(remote_host_name))\n",
|
||||
"\n",
|
||||
"# Disconnect from Ray.\n",
|
||||
"ray.shutdown()"
|
||||
]
|
||||
},
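{
"cell_type": "markdown",
"id": "c3d4e5f0",
"metadata": {},
"source": [
"As an optional follow-up, the sketch below launches several copies of the same task over Ray Client. Depending on scheduling, the returned host names may include both the head pod and worker pods. It assumes the port-forward to `localhost:10001` is still running."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3d4e5f1",
"metadata": {},
"outputs": [],
"source": [
"# Optional sketch: run several copies of the task and collect the pod names that executed them.\n",
"# Assumes the Ray Client port-forward to localhost:10001 is still active.\n",
"import platform\n",
"\n",
"import ray\n",
"\n",
"ray.init(\"ray://localhost:10001\")\n",
"\n",
"@ray.remote\n",
"def get_host_name():\n",
"    return platform.node()\n",
"\n",
"# Launch four tasks; they may land on the head pod and/or worker pods.\n",
"host_names = ray.get([get_host_name.remote() for _ in range(4)])\n",
"print(sorted(set(host_names)))\n",
"\n",
"ray.shutdown()"
]
},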
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b87a4f2e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Cleanup\n",
|
||||
"\n",
|
||||
"### Deleting a Ray Cluster\n",
|
||||
"\n",
|
||||
"To delete the Ray Cluster we deployed in this example, you can run either of the following commands."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "71a34ee3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Delete by reference to the RayCluster custom resource\n",
|
||||
"! kubectl delete raycluster raycluster-autoscaler"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d7aa0221",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**OR**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "112e6d2e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Delete by reference to the yaml file we used to define the RayCluster CR \n",
|
||||
"! kubectl delete -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0de87d9d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Confirm that the Ray Cluster's pods are gone by running"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1b557ac5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl get pods"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6cc5dfff",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note that it may take several seconds for the Ray pods to be fully terminated.\n",
|
||||
"\n",
|
||||
"### Deleting the KubeRay operator\n",
|
||||
"\n",
|
||||
"In typical operation, the KubeRay operator should be left as a long-running process that manages many Ray clusters.\n",
|
||||
"If you would like to delete the operator and associated resources, run"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "24c371a4",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kubectl delete -k kuberay/ray-operator/config/default"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "99a76c96",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Deleting a local kind cluster\n",
|
||||
"\n",
|
||||
"Finally, if you'd like to delete your local kind cluster, run"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f7d4d6e6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! kind delete cluster"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.7.13"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,111 +0,0 @@
(kuberay-quickstart)=

# Quickstart

:::{warning}
This page is under construction!
:::

Here we describe how you can deploy an autoscaling Ray cluster on KubeRay. The following instructions are for
Minikube but the deployment works the same way on a real Kubernetes cluster. You need to have at
least 4 CPUs to run this example. First we make sure Minikube is initialized with

```shell
minikube start
```

Now you can deploy the KubeRay operator using

```shell
./ray/python/ray/autoscaler/kuberay/init-config.sh
kubectl create -k "ray/python/ray/autoscaler/kuberay/config/default"
```

```{admonition} Use kubectl create
Note `kubectl apply` will not work in the above command. `kubectl create` is required. See [KubeRay issue #271](https://github.com/ray-project/kuberay/issues/271).
```

You can verify that the operator has been deployed using

```shell
kubectl -n ray-system get pods
```

Now let's deploy a new Ray cluster:

```shell
kubectl create -f ray/python/ray/autoscaler/kuberay/ray-cluster.complete.yaml
```

## Using the autoscaler

Let's now try out the autoscaler. We can run the following command to get a
Python interpreter in the head pod:

```shell
kubectl exec `kubectl get pods -o custom-columns=POD:metadata.name | grep raycluster-complete-head` -it -c ray-head -- python
```

In the Python interpreter, run the following snippet to scale up the cluster:

```python
import ray.autoscaler.sdk
ray.init()
ray.autoscaler.sdk.request_resources(num_cpus=4)
```

```{admonition} The Ray autoscaler image.
The example config ray-cluster.complete.yaml specifies rayproject/ray:8c5fe4
as the Ray autoscaler image. This image carries the latest improvements to KubeRay autoscaling
support. This autoscaler image is confirmed to be compatible with Ray versions >= 1.11.0.
Once Ray autoscaler support is stable, the recommended pattern will be to use the same
Ray version in the autoscaler and Ray containers.
```

## Uninstalling the KubeRay operator

You can uninstall the KubeRay operator using
```shell
kubectl delete -f "ray/python/ray/autoscaler/kuberay/kuberay-autoscaler-rbac.yaml"
kubectl delete -k "ray/python/ray/autoscaler/kuberay/config/default"
```

Note that all running Ray clusters will automatically be terminated.

## Further details on Ray autoscaler support.

Check out the [KubeRay documentation](https://ray-project.github.io/kuberay/guidance/autoscaler/)
for more details on Ray autoscaler support.

## Developing the KubeRay integration (advanced)

### Developing the KubeRay operator
If you also want to change the underlying KubeRay operator, please refer to the instructions
in [the KubeRay development documentation](https://github.com/ray-project/kuberay/blob/master/ray-operator/DEVELOPMENT.md). In that case you should push the modified operator to your docker account or registry and
follow the instructions in `ray/python/ray/autoscaler/kuberay/init-config.sh`.

### Developing the Ray autoscaler code
Code for the Ray autoscaler's KubeRay integration is located in `ray/python/ray/autoscaler/_private/kuberay`.

Here is one procedure to test development autoscaler code.
1. Push autoscaler code changes to your fork of Ray.
2. Use the following Dockerfile to build an image with your changes.
```dockerfile
# Use the latest Ray master as base.
FROM rayproject/ray:nightly
# Invalidate the cache so that fresh code is pulled in the next step.
ARG BUILD_DATE
# Retrieve your development code.
RUN git clone -b <my-dev-branch> https://github.com/<my-git-handle>/ray
# Install symlinks to your modified Python code.
RUN python ray/python/ray/setup-dev.py -y
```
3. Push the image to your docker account or registry. Assuming your Dockerfile is named "Dockerfile":
```shell
docker build --build-arg BUILD_DATE=$(date +%Y-%m-%d:%H:%M:%S) -t <registry>/<repo>:<tag> - < Dockerfile
docker push <registry>/<repo>:<tag>
```
4. Update the autoscaler image in `ray-cluster.complete.yaml`

Refer to the [Ray development documentation](https://docs.ray.io/en/latest/development.html#building-ray-python-only) for
further details.
@@ -9,7 +9,7 @@ Deploying a Static Ray Cluster on Kubernetes

This document gives an example of how to manually deploy a non-autoscaling Ray cluster on Kubernetes.

- Learn about deploying an autoscaling Ray cluster using the :ref:`Ray Helm chart<ray-k8s-deploy>`.
- Learn about deploying an autoscaling Ray cluster using the :ref:`Ray Helm chart<kuberay-index>`.

Creating a Ray Namespace
------------------------

@@ -88,7 +88,7 @@ In order to deploy Ray Serve on Kubernetes, we need to do the following:
2. Expose the head node of the cluster as a [Service].
3. Start Ray Serve on the cluster.

There are multiple ways to start a Ray cluster on Kubernetes, see {ref}`ray-k8s-deploy` for more information.
There are multiple ways to start a Ray cluster on Kubernetes, see {ref}`kuberay-index` for more information.
Here, we will be using the [Ray Cluster Launcher](cluster-cloud) tool, which has support for Kubernetes as a backend.

The cluster launcher takes in a yaml config file that describes the cluster.