# Ray Kubernetes Operator

The Ray Operator is a Kubernetes operator that automates the provisioning, management, autoscaling, and operation of Ray clusters deployed to Kubernetes.

Some of the main features of the Ray Operator are:
- user management via CRD
- heterogeneous pods in one Ray cluster, with specific affinity, tolerations, and other predefined settings
- monitoring via Prometheus
- HA for the Ray Kubernetes Operator: if the leader crashes, a new leader is elected

## File structure:

> ```
> ray/deploy/ray-operator
> ├── api/v1alpha1   // Package v1alpha1 contains API Schema definitions for the ray v1alpha1 API group
> │   ├── groupversion_info.go     // contains common metadata about the group-version
> │   ├── raycluster_types.go      // RayCluster field definitions; users should focus on this file
> │   └── zz_generated.deepcopy.go // contains the autogenerated implementation of the runtime.Object interface, which marks all of our root types as representing Kinds
> │
> └── config   // contains the Kustomize YAML definitions required to launch our controller on a cluster; holds our CustomResourceDefinitions, RBAC configuration, and WebhookConfigurations
>     ├── certmanager
>     │   ├── certificate.yaml   // contains a self-signed issuer CR and a certificate CR
>     │   ├── kustomization.yaml
>     │   └── kustomizeconfig.yaml
>     │
>     ├── crd
>     │   ├── bases
>     │   │   └── ray.io_rayclusters.yaml         // RayCluster CRD yaml file
>     │   ├── patches
>     │   │   ├── cainjection_in_rayclusters.yaml // adds a directive for certmanager to inject CA into the CRD
>     │   │   └── webhook_in_rayclusters.yaml     // enables conversion webhook for CRD
>     │   ├── kustomization.yaml
>     │   └── kustomizeconfig.yaml
>     │
>     ├── default   // contains a Kustomize base for launching the controller in a standard configuration
>     │   ├── kustomization.yaml
>     │   ├── manager_auth_proxy_patch.yaml   // injects a sidecar container that is an HTTP proxy for the controller manager; it performs RBAC authorization against the Kubernetes API using SubjectAccessReviews
>     │   ├── manager_webhook_patch.yaml      // webhook yaml file
>     │   └── webhookcainjection_patch.yaml   // adds an annotation to the admission webhook config
>     │
>     ├── manager   // launches your controllers as pods in the cluster
>     │   ├── kustomization.yaml
>     │   └── manager.yaml   // manager yaml that creates the controller deployment; users should focus on this file
>     │
>     ├── prometheus
>     │   ├── kustomization.yaml
>     │   └── monitor.yaml   // Prometheus Monitor Service; users should focus on this file
>     │
>     ├── rbac   // permissions required to run your controllers under their own service account
>     │   ├── auth_proxy_role.yaml
>     │   ├── auth_proxy_role_binding.yaml
>     │   ├── auth_proxy_service.yaml
>     │   ├── kustomization.yaml
>     │   ├── leader_election_role.yaml   // permissions to do leader election
>     │   ├── leader_election_role_binding.yaml
>     │   └── role_binding.yaml
>     │
>     ├── samples   // sample RayCluster yaml; users should focus on these files
>     │   ├── ray_v1_raycluster.complete.yaml
>     │   ├── ray_v1_raycluster.heterogeneous.yaml
>     │   └── ray_v1_raycluster.mini.yaml
>     │
>     └── webhook
>         ├── kustomization.yaml
>         ├── kustomizeconfig.yaml
>         ├── manifests.yaml
>         └── service.yaml   // webhook-service
> ```

## RayCluster sample CR

Three sample RayCluster CRs introduce the Ray Operator:

Sample | Description
------------- | -------------
[RayCluster.mini.yaml](config/samples/ray_v1_raycluster.mini.yaml) | 2 pods: 1 for the head and 1 for workers. Contains the minimum information needed to start a Ray cluster; intended for local testing.
[RayCluster.heterogeneous.yaml](config/samples/ray_v1_raycluster.heterogeneous.yaml) | 3 pods: 1 for the head and 2 for workers with different specifications. Uses different quotas (such as CPU and memory) than the mini version; intended for local testing.
[RayCluster.complete.yaml](config/samples/ray_v1_raycluster.complete.yaml) | A complete CR for customized requirements that shows how to set custom properties. Sets more properties than the heterogeneous version; intended for production.

## RayCluster CRD

Refer to [raycluster_types.go](api/v1alpha1/raycluster_types.go) for the code details.

For more details on the CRD itself, refer to the [CRD](config/crd/bases/ray.io_rayclusters.yaml) file.

## Software requirement
Note that some of the software below has dependencies of its own.

software | version | memo
:------------- | :---------------:| -------------:
kustomize | v3.1.0+ | [download](https://github.com/kubernetes-sigs/kustomize)
kubectl | v1.11.3+ | [download](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
Kubernetes Cluster | Access to a Kubernetes v1.11.3+ cluster | [Minikube](https://github.com/kubernetes/minikube) for local test
go | v1.13+ | [download](https://golang.org/dl/)
docker | 17.03+ | [download](https://docs.docker.com/install/)

You also need a kubeconfig in ~/.kube/config so that you can access the Kubernetes cluster.

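As a quick sanity check before continuing, the tools from the table above can be looked up on `PATH`. This is a minimal sketch: `missing_tools` is a hypothetical helper name, and it only checks that each binary exists, not that its version meets the minimum listed in the table.

```shell
# Hypothetical helper: print every tool from the argument list
# that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools from the requirements table; no output means all were found.
missing_tools kustomize kubectl go docker
```

Any name that is printed still has to be installed; version checks are left to each tool's own version command.
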
## Get started
The steps below guide you through submitting a RayCluster:

### Install CRDs into a cluster

```shell script
kustomize build config/crd | kubectl apply -f -
```

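To confirm that the CRD was registered, one option is to wait for its `Established` condition. The sketch below makes two assumptions: the CRD is named `rayclusters.ray.io` (inferred from the manifest file name `ray.io_rayclusters.yaml`), and `KUBECTL` and `wait_for_crd` are hypothetical names introduced here, not part of the operator.

```shell
# Assumption: KUBECTL is an optional override for the kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: block until the named CRD reports the
# Established condition, or fail after 60 seconds.
wait_for_crd() {
  "$KUBECTL" wait --for=condition=established --timeout=60s "crd/$1"
}

# Usage, with the CRD name assumed from ray.io_rayclusters.yaml:
#   wait_for_crd rayclusters.ray.io
```

If the wait times out, `kubectl get crd` lists the CRDs that are actually present in the cluster.
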
### Deploy controller in the configured Kubernetes cluster in ~/.kube/config
* In this version the controller runs in the system namespace, which may not be acceptable in production.
* We will add a more detailed RBAC file to control the namespace used in production; the controller will then run in that namespace so that its permissions are scoped to it.
* We will also provide a more detailed guide for running the controller in a controlled way.

```shell script
# Run from the ray-operator directory: config/default is resolved
# relative to the current directory.
kustomize build config/default | kubectl apply -f -
```

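Because the namespace matters in production, it can help to see which namespaces the rendered manifests use before applying them. The following is a rough sketch: `rendered_namespaces` is a hypothetical helper, and it is a plain-text filter over `namespace:` lines, not a YAML parser.

```shell
# Hypothetical helper: read rendered YAML on stdin and print the
# distinct values of every "namespace:" field.
rendered_namespaces() {
  grep -E '^[[:space:]]*namespace:' |
    sed -E 's/^[[:space:]]*namespace:[[:space:]]*//' |
    sort -u
}

# Usage, from the ray-operator directory:
#   kustomize build config/default | rendered_namespaces
```

The output lists each namespace once, so an unexpected entry is easy to spot before anything is applied to the cluster.
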
### Submit RayCluster to Kubernetes
```shell script
kubectl create -f config/samples/ray_v1_raycluster.mini.yaml
```

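After submitting the mini sample, the result can be inspected with `kubectl get`. This sketch assumes the custom resource is exposed under the plural name `rayclusters` (inferred from `ray.io_rayclusters.yaml`) and that the sample was created in the current namespace; `KUBECTL` and `show_ray_cluster` are hypothetical names.

```shell
# Assumption: KUBECTL is an optional override for the kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: list RayCluster objects, then the pods
# in the current namespace.
show_ray_cluster() {
  "$KUBECTL" get rayclusters &&
    "$KUBECTL" get pods
}

# Usage:
#   show_ray_cluster
```

For the mini sample, the pod list should eventually contain 2 pods: 1 for the head and 1 for workers.
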
### Apply RayCluster to Kubernetes
Unlike `kubectl create`, which fails if the RayCluster already exists, `kubectl apply` creates the RayCluster or updates it in place.

```shell script
kubectl apply -f config/samples/ray_v1_raycluster.mini.yaml
```

### Delete RayCluster from Kubernetes
```shell script
kubectl delete -f config/samples/ray_v1_raycluster.mini.yaml
```