ray/doc/source/cluster/kuberay.md
Dmitri Gekhtman 413fe08f87
Move KubeRay autoscaler files into Ray autoscaler directory, add an entry-point. (#22847)
This PR consists of the following clean-up items for KubeRay autoscaler integration:

Remove the docker/kuberay directory

Move the Python files formerly in docker/kuberay to the autoscaler directory.

Use a rayproject/ray image for the autoscaler.

Add an entry point for the kuberay autoscaler to scripts.py. Use the entry point in the example config.

Slightly simplify the code that starts the autoscaler.

Ray versions are updated to Ray 1.11.0, which will be officially released within the next couple of days.

By default, Ray >= 1.11.0 runs without Redis. References to Redis are removed from the example config.

Add the autoscaler configuration test to the CI.

Update development documentation to reflect the changes in this PR.
2022-03-09 18:26:57 -08:00

Deploying with KubeRay (experimental)

[KubeRay](https://github.com/ray-project/kuberay) is a set of tools for running Ray on Kubernetes.
Several larger companies already use it to deploy Ray on their infrastructure.
Going forward, we would like to make this deployment path accessible and seamless for
all Ray users, and to standardize Ray deployment on Kubernetes around KubeRay's operator.
For now, treat this integration as a minimum viable product that is not yet polished
enough for general use, and prefer the [Kubernetes integration](kubernetes.rst) for running
Ray on Kubernetes. If you are brave enough to try the KubeRay integration, this documentation
is for you! We would love your feedback as a [GitHub issue](https://github.com/ray-project/ray/issues)
with `[KubeRay]` in the title.

Here we describe how to deploy a Ray cluster with KubeRay. The following instructions are for Minikube, but the deployment works the same way on a real Kubernetes cluster. You need at least 4 CPUs to run this example. First, make sure Minikube is started:

minikube start

Now you can deploy the KubeRay operator. The paths below are relative to the directory that contains your `ray` checkout:

./ray/python/ray/autoscaler/kuberay/init-config.sh
kubectl apply -k "ray/python/ray/autoscaler/kuberay/config/default"
kubectl apply -f "ray/python/ray/autoscaler/kuberay/kuberay-autoscaler-rbac.yaml"

You can verify that the operator has been deployed using

kubectl -n ray-system get pods
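
If you would rather check this from a script than by eye, the following is a minimal sketch. It assumes `kubectl` is on your `PATH` and that the operator pods live in the `ray-system` namespace used above. The helper names `pods_not_running` and `fetch_pods` are hypothetical and not part of KubeRay or Ray.

```python
import json
import subprocess


def pods_not_running(pod_list):
    """Return the names of pods whose phase is not "Running".

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") != "Running"
    ]


def fetch_pods(namespace="ray-system"):
    """Fetch the pod list for a namespace via kubectl (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "-n", namespace, "get", "pods", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Example usage against a live cluster:
#   pending = pods_not_running(fetch_pods())
#   print("operator pods running" if not pending else f"not running: {pending}")
```

Note that an empty pod list also yields an empty result, so you may want to confirm separately that the namespace contains at least one pod. A pod in the `Running` phase is also not necessarily ready to serve requests.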

Now let's deploy a new Ray cluster:

kubectl create -f ray/python/ray/autoscaler/kuberay/ray-cluster.complete.yaml

Using the autoscaler

Let's now try out the autoscaler. We can run the following command to get a Python interpreter in the head pod:

kubectl exec `kubectl get pods -o custom-columns=POD:metadata.name | grep raycluster-complete-head` -it -c ray-head -- python

In the Python interpreter, run the following snippet to scale up the cluster:

import ray.autoscaler.sdk
ray.init("auto")
ray.autoscaler.sdk.request_resources(num_cpus=4)
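
Scaling up is not instantaneous, since new worker pods have to be scheduled and join the cluster. The sketch below shows one way to wait for the requested capacity from the same interpreter. The helper `wait_for_cpus` is hypothetical and not part of Ray; the timeout and polling interval are arbitrary assumptions.

```python
import time


def wait_for_cpus(get_cpu_count, target, timeout_s=300.0, poll_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_cpu_count() until it reports at least `target` CPUs.

    Returns True once the target is reached, or False if `timeout_s`
    seconds pass first. `sleep` and `clock` are injectable for testing.
    """
    deadline = clock() + timeout_s
    while True:
        if get_cpu_count() >= target:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Example usage in the head pod, after the request_resources call above:
#   import ray
#   wait_for_cpus(lambda: ray.cluster_resources().get("CPU", 0), target=4)
```

Passing the CPU lookup as a callable keeps the helper independent of Ray, so it can be exercised without a running cluster.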

Uninstalling the KubeRay operator

You can uninstall the KubeRay operator using

kubectl delete -f "ray/python/ray/autoscaler/kuberay/kuberay-autoscaler-rbac.yaml"
kubectl delete -k "ray/python/ray/autoscaler/kuberay/config/default"

Note that all running Ray clusters will automatically be terminated.

Developing the KubeRay integration (advanced)

Developing the KubeRay operator

If you also want to change the underlying KubeRay operator, refer to the instructions in the KubeRay development documentation. In that case, push the modified operator image to your Docker account or registry and follow the instructions in ray/python/ray/autoscaler/kuberay/init-config.sh.

Developing the Ray autoscaler code

Code for the Ray autoscaler's KubeRay integration is located in ray/python/ray/autoscaler/_private/kuberay.

Here is one way to test autoscaler code that is under development.

  1. Push autoscaler code changes to your fork of Ray.
  2. Use the following Dockerfile to build an image with your changes.
# Use the latest Ray master as base.
FROM rayproject/ray:nightly
# Retrieve your development code.
RUN git clone -b <my-dev-branch> https://github.com/<my-git-handle>/ray
# Install symlinks to your modified Python code.
RUN python ray/python/ray/setup-dev.py -y
  3. Push the image to your Docker account or registry.
  4. Update the autoscaler image in ray-cluster.complete.yaml.

Refer to the Ray development documentation for further details.