[Docs][Kubernetes] Fix link, add a bit of content (#28017)

Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>

Fixes the "legacy operator" link to point to master, rather than the 2.0.0 branch. The migration README exists in master but not in the 2.0.0 branch.
Adds a sentence explaining that the Ray container has to go first in the container list.
Adds a sentence to the config guide mentioning min/max replicas and linking to the autoscaling docs.
Documents a bug related to GPU auto-detection in KubeRay 0.3.0.
Dmitri Gekhtman authored on 2022-08-26 12:02:18 -07:00; committed by GitHub
parent 96d579a4fe
commit ce99cf1b71
3 changed files with 45 additions and 9 deletions

@@ -90,7 +90,7 @@ the project.
and discussion of new and upcoming features.
```{note}
-The KubeRay operator replaces the older Ray operator hosted in the [Ray repository](https://github.com/ray-project/ray/tree/releases/2.0.0/python/ray/ray_operator).
+The KubeRay operator replaces the older Ray operator hosted in the [Ray repository](https://github.com/ray-project/ray/tree/master/python/ray/ray_operator).
If you have used the legacy Ray operator in the past,
check the linked README for migration notes.

@@ -121,7 +121,8 @@ specified under `headGroupSpec`, while configuration for worker pods is
specified under `workerGroupSpecs`. There may be multiple worker groups,
each group with its own configuration. The `replicas` field
of a `workerGroupSpec` specifies the number of worker pods of that group to
-keep in the cluster.
+keep in the cluster. Each `workerGroupSpec` also has optional `minReplicas` and
+`maxReplicas` fields; these fields are important if you wish to enable {ref}`autoscaling <kuberay-autoscaling-config>`.
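
For illustration, here is a minimal sketch of a worker group with these fields set. The group name, image, and quantities are placeholders, and the sketch assumes the `enableInTreeAutoscaling` RayCluster field is also set to `true` when autoscaling is desired:

```yaml
workerGroupSpecs:
- groupName: example-group   # placeholder name
  replicas: 1                # worker pods to keep in the cluster right now
  minReplicas: 0             # lower bound the autoscaler may scale down to
  maxReplicas: 5             # upper bound the autoscaler may scale up to
  rayStartParams: {}
  template:
    spec:
      containers:
      - name: ray-worker
        image: rayproject/ray:2.0.0   # placeholder image
```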
### Pod templates
The bulk of the configuration for a `headGroupSpec` or
@@ -129,6 +130,14 @@ The bulk of the configuration for a `headGroupSpec` or
template which determines the configuration for the pods in the group.
Here are some of the subfields of the pod `template` to pay attention to:
#### containers
+A Ray pod template specifies at minimum one container, namely the container
+that runs the Ray processes. A Ray pod template may also specify additional sidecar
+containers, for purposes such as {ref}`log processing <kuberay-logging>`. However, the KubeRay operator assumes that
+the first container in the containers list is the main Ray container.
+Therefore, make sure to specify any sidecar containers
+**after** the main Ray container. In other words, the Ray container should be the **first**
+in the `containers` list.
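
To make the ordering concrete, here is a sketch of a pod `template`; the sidecar name and image are hypothetical and only illustrate the point:

```yaml
template:
  spec:
    containers:
    # The main Ray container must be listed first.
    - name: ray-worker
      image: rayproject/ray:2.0.0
    # Sidecar containers follow the Ray container.
    - name: log-forwarder            # hypothetical log-processing sidecar
      image: fluent/fluent-bit:1.9   # hypothetical image
```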
#### resources
It's important to specify container CPU and memory requests and limits for
@@ -153,8 +162,20 @@ Note that CPU quantities will be rounded up to the nearest integer
before being relayed to Ray.
The resource capacities advertised to Ray may be overridden in the {ref}`rayStartParams`.
+:::{warning}
+Due to a [bug](https://github.com/ray-project/kuberay/pull/497) in KubeRay 0.3.0,
+the following piece of configuration is required to advertise the presence of GPUs
+to Ray.
+```yaml
+rayStartParams:
+  num-gpus: "1"
+```
+Future releases of KubeRay will not require this. (GPU quantities will be correctly auto-detected
+from container limits.)
+:::
On the other hand, CPU, GPU, and memory **requests** will be ignored by Ray.
-For this reason, it is best when possible to set resource requests equal to resource limits.
+For this reason, it is best when possible to **set resource requests equal to resource limits**.
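
For example, here is a sketch with arbitrary quantities in which each request matches the corresponding limit:

```yaml
resources:
  requests:
    cpu: "2"
    memory: 8Gi
  limits:
    cpu: "2"      # equal to the request
    memory: 8Gi   # equal to the request
```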
#### nodeSelector and tolerations
You can control the scheduling of worker groups' Ray pods by setting the `nodeSelector` and
`tolerations` fields of the pod spec.
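
As a sketch, assuming a hypothetical node label and taint, pinning a group's pods to GPU nodes might look like:

```yaml
template:
  spec:
    nodeSelector:
      gpu-node-pool: "true"    # hypothetical node label
    tolerations:
    - key: "nvidia.com/gpu"    # hypothetical taint on the GPU nodes
      operator: "Exists"
      effect: "NoSchedule"
```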
@@ -209,9 +230,8 @@ Note that the values of all Ray start parameters, including `num-cpus`,
must be supplied as **strings**.
### num-gpus
-This optional field specifies the number of GPUs available to the Ray container.
-In KubeRay versions since 0.3.0, the number of GPUs can be auto-detected from Ray container resource limits.
-For certain advanced use-cases, you may wish to use `num-gpus` to set an {ref}`override <kuberay-gpu-override>`.
+This field specifies the number of GPUs available to the Ray container.
+In future KubeRay versions, the number of GPUs will be auto-detected from Ray container resource limits.
Note that the values of all Ray start parameters, including `num-gpus`,
must be supplied as **strings**.
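
For instance, a sketch with the values quoted so that they are parsed as strings:

```yaml
rayStartParams:
  num-cpus: "3"   # the string "3", not the integer 3
  num-gpus: "1"   # the string "1", not the integer 1
```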

@@ -35,6 +35,8 @@ to 5 GPU workers.
.. code-block:: yaml

    groupName: gpu-group
+   rayStartParams:
+       num-gpus: "1" # Advertise GPUs to Ray.
    replicas: 0
    minReplicas: 0
    maxReplicas: 5
@@ -47,17 +49,30 @@ to 5 GPU workers.
    image: rayproject/ray-ml:2.0.0-gpu
    ...
    resources:
-       cpu: 3
-       memory: 50Gi
-       nvidia.com/gpu: 1 # Optional, included just for documentation.
-       limits:
-           cpu: 3
-           memory: 50Gi
+       limits:
+           nvidia.com/gpu: 1 # Required to use GPU.
+           cpu: 3
+           memory: 50Gi
    ...
Each of the Ray pods in the group can be scheduled on an AWS `p2.xlarge` instance (1 GPU, 4 vCPUs, 61Gi RAM).
+.. warning::
+
+    Note the following piece of required configuration:
+
+    .. code-block:: yaml
+
+        rayStartParams:
+            num-gpus: "1"
+
+    This extra configuration is required due to a `bug`_ in KubeRay 0.3.0.
+    KubeRay master does not require this piece of configuration, nor will future KubeRay releases;
+    the GPU Ray start parameters will be auto-detected from container resource limits.
.. tip::

    GPU instances are expensive -- consider setting up autoscaling for your GPU Ray workers,
@@ -215,3 +230,4 @@ and about Nvidia's GPU plugin for Kubernetes `here <https://github.com/NVIDIA/k8
.. _`admission controller`: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/
.. _`ExtendedResourceToleration`: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#extendedresourcetoleration
.. _`Kubernetes docs`: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
+.. _`bug`: https://github.com/ray-project/kuberay/pull/497/