diff --git a/doc/source/_toc.yml b/doc/source/_toc.yml
index 3396fc461..1b37568c7 100644
--- a/doc/source/_toc.yml
+++ b/doc/source/_toc.yml
@@ -267,6 +267,7 @@ parts:
     - file: cluster/kubernetes/user-guides/configuring-autoscaling.md
     - file: cluster/kubernetes/user-guides/logging.md
     - file: cluster/kubernetes/user-guides/gpu.md
+    - file: cluster/kubernetes/user-guides/experimental.md
     - file: cluster/kubernetes/examples
       sections:
       - file: cluster/kubernetes/examples/ml-example.md
diff --git a/doc/source/cluster/kubernetes/user-guides.md b/doc/source/cluster/kubernetes/user-guides.md
index 3cba63bb5..fd7a7be6d 100644
--- a/doc/source/cluster/kubernetes/user-guides.md
+++ b/doc/source/cluster/kubernetes/user-guides.md
@@ -13,3 +13,4 @@ deployments of Ray on Kubernetes.
 * {ref}`kuberay-autoscaling`
 * {ref}`kuberay-gpu`
 * {ref}`kuberay-logging`
+* {ref}`kuberay-experimental`
diff --git a/doc/source/cluster/kubernetes/user-guides/experimental.md b/doc/source/cluster/kubernetes/user-guides/experimental.md
new file mode 100644
index 000000000..c2a0b7118
--- /dev/null
+++ b/doc/source/cluster/kubernetes/user-guides/experimental.md
@@ -0,0 +1,47 @@
+(kuberay-experimental)=
+
+# Experimental Features
+
+We provide an overview of new and experimental features available
+for deployments of Ray on Kubernetes.
+
+## RayServices
+
+The `RayService` controller enables fault-tolerant deployments of
+{ref}`Ray Serve ` applications on Kubernetes.
+
+If your Ray Serve application enters an unhealthy state, the RayService controller will create a new Ray cluster.
+Once the new cluster is ready, Ray Serve traffic will be re-routed to the new Ray cluster.
+
+For details, see the guide on [Kubernetes-based RayServe deployments][KubeServe].
+
+## GCS Fault Tolerance
+
+In addition to the application-level fault tolerance provided by the RayService controller,
+Ray now supports infrastructure-level fault tolerance for the Ray head pod.
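To make the head-pod fault tolerance introduced above concrete, here is a minimal sketch of a RayCluster manifest with GCS fault tolerance enabled. The `ray.io/ft-enabled` annotation and `RAY_REDIS_ADDRESS` environment variable follow the KubeRay GCS fault tolerance guide, but the resource name, Redis address, and image tag are placeholders; verify the exact schema against your KubeRay version.

```yaml
# Sketch only: a RayCluster wired to an external Redis instance for GCS state.
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: raycluster-gcs-ft          # illustrative name
  annotations:
    ray.io/ft-enabled: "true"      # enable GCS fault tolerance
spec:
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.0.0   # placeholder image tag
            env:
              # Address of the external Redis instance backing the GCS.
              - name: RAY_REDIS_ADDRESS
                value: redis:6379
```

With a configuration along these lines, a replacement head pod recovers GCS state from Redis rather than starting from scratch.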
+
+You can set up an external Redis instance as a data store for the Ray head. If the Ray head crashes,
+a new head will be created without restarting the Ray cluster.
+The Ray head's GCS will recover its state from the external Redis instance.
+
+See the {ref}`Ray Serve documentation <serve-head-node-failure>` for more information and
+the [KubeRay docs on GCS Fault Tolerance][KubeFT] for a detailed guide.
+
+## RayJobs
+
+The `RayJob` custom resource consists of two elements:
+1. Configuration for a Ray cluster.
+2. A job, i.e. a Ray program to be executed on the Ray cluster.
+
+To run a Ray job, you create a RayJob CR:
+```shell
+kubectl apply -f rayjob.yaml
+```
+The RayJob controller then creates the Ray cluster and runs the job.
+If you wish, you may configure the Ray cluster to be deleted when the job finishes.
+
+See the [KubeRay docs on RayJobs][KubeJob] for details.
+
+[KubeServe]: https://ray-project.github.io/kuberay/guidance/rayservice/
+[KubeFT]: https://ray-project.github.io/kuberay/guidance/gcs-ft/
+[KubeJob]: https://ray-project.github.io/kuberay/guidance/rayjob/
diff --git a/doc/source/serve/production-guide/failures.md b/doc/source/serve/production-guide/failures.md
index f98eaa7b8..147b9d422 100644
--- a/doc/source/serve/production-guide/failures.md
+++ b/doc/source/serve/production-guide/failures.md
@@ -31,6 +31,7 @@ You can also customize how frequently the health check is run and the timeout af
 > raise RuntimeError("uh-oh, DB connection is broken.")
 > ```
 
+(serve-head-node-failure)=
 ## Head node failures
 
 By default the Ray head node is a single point of failure: if it crashes, the entire cluster crashes and needs to be restarted.
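As a sketch of the `rayjob.yaml` applied in the RayJobs section above, a RayJob CR bundles the two elements the docs describe: a Ray cluster configuration and the job to run on it. The field names follow the KubeRay RayJob CRD, but the entrypoint, image, and resource name here are illustrative; consult the KubeRay RayJob guide for the authoritative schema.

```yaml
# Sketch only: a RayJob pairing a cluster spec with a job entrypoint.
apiVersion: ray.io/v1alpha1
kind: RayJob
metadata:
  name: rayjob-sample              # illustrative name
spec:
  # Element 2: the Ray program to execute on the cluster.
  entrypoint: python /home/ray/samples/sample_code.py   # placeholder script
  # Optionally delete the Ray cluster once the job finishes.
  shutdownAfterJobFinishes: true
  # Element 1: the Ray cluster configuration.
  rayClusterSpec:
    headGroupSpec:
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:2.0.0   # placeholder image tag
```

Applying a manifest like this with `kubectl apply -f rayjob.yaml` lets the RayJob controller create the cluster, run the entrypoint, and (with the flag above) tear the cluster down afterward.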