The Ray Operator is a Kubernetes operator that automates the provisioning, management, autoscaling, and operation of Ray clusters deployed on Kubernetes.
Some of the main features of the Ray Operator are:
- user management via CRD
- heterogeneous pods in one Ray cluster, each with its own affinity, tolerations, and other pre-defined settings
- monitoring via Prometheus
- high availability (HA) for the Ray Operator: a leader election takes place if the leader crashes
## File structure
> ```
> ray/deploy/ray-operator
> ├── api/v1alpha1 // Package v1alpha1 contains API Schema definitions for the ray v1alpha1 API group
> │ ├── groupversion_info.go // contains common metadata about the group-version
> │ ├── raycluster_types.go // RayCluster field definitions; users should focus on this file
> │ └── zz_generated.deepcopy.go // contains the autogenerated implementation of the aforementioned runtime.Object interface, which marks all of our root types as representing Kinds.
> │ ├── default // contains a Kustomize base for launching the controller in a standard configuration.
> │ │ ├── kustomization.yaml
> │ │ ├── manager_auth_proxy_patch.yaml // injects a sidecar container that acts as an HTTP proxy for the controller manager; it performs RBAC authorization against the Kubernetes API using SubjectAccessReviews.
> ```

Three sample RayCluster custom resources (CRs) introduce the Ray Operator:

Sample | Description
------------- | -------------
[RayCluster.mini.yaml](config/samples/ray_v1_raycluster.mini.yaml) | 2 pods: 1 head and 1 worker. Contains the minimum information needed to start a Ray cluster; intended for local testing.
[RayCluster.heterogeneous.yaml](config/samples/ray_v1_raycluster.heterogeneous.yaml) | 3 pods: 1 head and 2 workers with different specifications. Uses different resource quotas (such as CPU and memory) than the mini version; intended for local testing.
[RayCluster.complete.yaml](config/samples/ray_v1_raycluster.complete.yaml) | A complete CR for customized requirements that shows how to set custom properties. Sets more properties than the heterogeneous version; intended for production.
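
As a rough sketch, one of these samples could be submitted as shown here. This assumes the CRDs and the controller are already installed and that the command runs from the `ray-operator` directory. The helper name `submit_raycluster` is hypothetical; only the sample paths come from the table.

```shell
# Hypothetical helper (not part of the repository): apply a sample RayCluster CR
# and then list the pods in the current namespace.
submit_raycluster() {
  # Default to the minimal sample; pass another sample path as the first argument.
  sample="${1:-config/samples/ray_v1_raycluster.mini.yaml}"
  kubectl apply -f "$sample" || return 1
  kubectl get pods
}
```

For example, `submit_raycluster config/samples/ray_v1_raycluster.heterogeneous.yaml` would apply the heterogeneous sample instead of the minimal one.
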
## RayCluster CRD
Refer to [raycluster_types.go](api/v1alpha1/raycluster_types.go) for the code details.
For the CRD itself, see the [CRD](config/crd/bases/ray.io_rayclusters.yaml) file.
You also need a kubeconfig at `~/.kube/config` so that you can access the Kubernetes cluster.
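
A quick way to check this prerequisite might look like the following. The helper name `check_kubeconfig` is hypothetical, and the sketch only tests that the file exists; it does not validate its contents.

```shell
# Hypothetical helper: confirm that a kubeconfig file exists before running kubectl.
check_kubeconfig() {
  # Default to ~/.kube/config; pass another path as the first argument.
  cfg="${1:-$HOME/.kube/config}"
  if [ -f "$cfg" ]; then
    echo "found kubeconfig: $cfg"
  else
    echo "missing kubeconfig: $cfg" >&2
    return 1
  fi
}
```

If the file is found, running `kubectl cluster-info` is one way to confirm that the cluster it points to is actually reachable.
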
## Get started
The following steps walk through submitting a RayCluster:
### Install CRDs into a cluster
```shell script
kustomize build config/crd | kubectl apply -f -
```
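
To confirm that the CRD was registered, something like the following sketch could be used. The CRD name `rayclusters.ray.io` is an assumption inferred from the file name `config/crd/bases/ray.io_rayclusters.yaml`, and the helper name `wait_for_raycluster_crd` is hypothetical.

```shell
# Hypothetical helper: wait until the RayCluster CRD is established in the cluster.
wait_for_raycluster_crd() {
  # Assumed CRD name, inferred from config/crd/bases/ray.io_rayclusters.yaml.
  crd="${1:-rayclusters.ray.io}"
  kubectl wait --for condition=established --timeout=60s "crd/${crd}"
}
```

If the assumed name is wrong, `kubectl get crd` lists the CRDs that are actually installed.
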
### Deploy the controller to the Kubernetes cluster configured in ~/.kube/config
* In this version the controller runs in the system namespace, which may not be acceptable in production.
* We will add a more detailed RBAC file to control the namespace used in production; the controller will run in that namespace to limit its permissions.
* We will also provide a more detailed guide for running the controller in a controlled way.