[clusters][docs] Provide urls to content, fix typos (#27936)
This commit is contained in:
parent a08ed6ba75
commit 17a8db048f

6 changed files with 35 additions and 63 deletions
@@ -55,10 +55,7 @@ Broadly speaking, it is more efficient to use a few large Ray pods than many sma

 We recommend taking a look at the [config file][ConfigLink] applied in the following command.
 ```shell
-# Starting from the parent directory of cloned Ray master,
-pushd ray/doc/source/cluster/kubernetes/configs/
-kubectl apply -f xgboost-benchmark.yaml
-popd
+kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark.yaml
 ```

 A Ray head pod and 9 Ray worker pods will be created.
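
For readers following along with the command above, a quick way to confirm that the head and worker pods have come up is to query the RayCluster resource and its pods. This is a hedged sketch: the `raycluster-xgboost-benchmark` name comes from this guide's config, and `ray.io/cluster` is the label KubeRay applies to the pods it manages.

```shell
# Confirm the RayCluster custom resource exists.
kubectl get raycluster

# List the head and worker pods for this cluster; all 10 should eventually reach Running.
kubectl get pods --selector=ray.io/cluster=raycluster-xgboost-benchmark
```
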
@@ -66,7 +63,7 @@ A Ray head pod and 9 Ray worker pods will be created.

 ```{admonition} Optional: Deploying an autoscaling Ray cluster
 If you've set up an autoscaling node group or pool, you may wish to deploy
-an autoscaling cluster by applying the config `xgboost-benchmark-autoscaler.yaml`.
+an autoscaling cluster by applying the config [xgboost-benchmark-autoscaler.yaml][ConfigLinkAutoscaling].
 One Ray head pod will be created. Once the workload starts, the Ray autoscaler will trigger
 creation of Ray worker pods. Kubernetes autoscaling will then create nodes to place the Ray pods.
 ```
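
For reference, applying the autoscaling variant looks just like the fixed-size case, using the config that the [ConfigLinkAutoscaling] reference at the bottom of this file points to:

```shell
kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark-autoscaler.yaml
```
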
@@ -100,13 +97,13 @@ We'll use the {ref}`Ray Job Python SDK <ray-job-sdk>` to submit the XGBoost work
 ```

 To submit the workload, run the above Python script.
-The script is available in the Ray repository.
+The script is available [in the Ray repository][XGBSubmit].

 ```shell
-# From the parent directory of cloned Ray master.
-pushd ray/doc/source/cluster/doc_code/
+# Download the above script.
+curl https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py -o xgboost_submit.py
+# Run the script.
 python xgboost_submit.py
-popd
 ```

 ### Observe progress.
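
The submission script talks to the Ray Job server on the cluster's head node, so it needs network access to the Ray dashboard port (8265). When running from outside the Kubernetes network, a port-forward is the usual bridge. The sketch below assumes the head service follows KubeRay's `<cluster name>-head-svc` naming, so `raycluster-xgboost-benchmark-head-svc` is an assumption based on the cluster name used in this guide.

```shell
# In a separate terminal, forward the Ray dashboard / Job server port to localhost
# and leave it running while the job is submitted and monitored.
kubectl port-forward service/raycluster-xgboost-benchmark-head-svc 8265:8265
```
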
@@ -118,7 +115,7 @@ Use the following tools to observe its progress.

 To follow the job's logs, use the command printed by the above submission script.
 ```shell
-# Subsitute the Ray Job's submission id.
+# Substitute the Ray Job's submission id.
 ray job logs 'raysubmit_xxxxxxxxxxxxxxxx' --follow
 ```

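
Besides following the logs, the Ray Job CLI can report the job's overall state, which is handy for scripting around a long-running benchmark. A small sketch, again substituting the real submission id:

```shell
# Report the job's current status (e.g. RUNNING, SUCCEEDED, FAILED).
ray job status 'raysubmit_xxxxxxxxxxxxxxxx'

# List all jobs submitted to this cluster.
ray job list
```
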
@@ -184,6 +181,6 @@ kubectl delete raycluster raycluster-xgboost-benchmark
 If you're on a public cloud, don't forget to clean up the underlying
 node group and/or Kubernetes cluster.

-<!-- TODO: Fix this -->
-<!-- [ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/291bba69fb90ee5e8401540ef55b7b74dd13f5c5/doc/source/cluster/ray-clusters-on-kubernetes/configs/xgboost-benchmark-autoscaler.yaml -->
-[ConfigLink]: https://github.com/ray-project/ray/tree/master/doc/source/cluster/
+[ConfigLink]:https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark.yaml
+[ConfigLinkAutoscaling]: https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark-autoscaler.yaml
+[XGBSubmit]: https://github.com/ray-project/ray/blob/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py
@@ -79,7 +79,7 @@
 "(kuberay-operator-deploy)=\n",
 "## Deploying the KubeRay operator\n",
 "\n",
-"Deploy the KubeRay Operator by cloning the KubeRay repo and applying the relevant configuration files from the master branch. "
+"Deploy the KubeRay Operator by applying the relevant configuration files from the KubeRay GitHub repo. "
 ]
 },
 {
@@ -89,12 +89,12 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"! git clone https://github.com/ray-project/kuberay -b release-0.3\n",
-"\n",
 "# This creates the KubeRay operator and all of the resources it needs.\n",
-"! kubectl create -k kuberay/ray-operator/config/default\n",
+"! kubectl create -k \"github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s\"\n",
 "\n",
-"# Note that we must use \"kubectl create\" in the above command. \"kubectl apply\" will not work due to https://github.com/ray-project/kuberay/issues/271"
+"# Note that we must use \"kubectl create\" in the above command. \"kubectl apply\" will not work due to https://github.com/ray-project/kuberay/issues/271\n",
+"\n",
+"# You may alternatively clone the KubeRay GitHub repo and deploy the operator's configuration from your local file system."
 ]
 },
 {
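
After the `kubectl create -k` step above, it's worth confirming the operator is running before creating any RayCluster resources. A minimal check, assuming the default `kuberay-operator` deployment name used by the KubeRay kustomize config:

```shell
# The operator runs as a single Deployment in the current namespace.
kubectl get deployment kuberay-operator

# Its pod should reach the Running state before RayCluster CRs are applied.
kubectl get pods
```
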
@@ -156,8 +156,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Deploy the Ray Cluster CR:\n",
-"! kubectl apply -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml\n",
+"# Deploy a sample Ray Cluster CR from the KubeRay repo:\n",
+"! kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/release-0.3/ray-operator/config/samples/ray-cluster.autoscaler.yaml\n",
 "\n",
 "# This Ray cluster is named `raycluster-autoscaler` because it has optional Ray Autoscaler support enabled."
 ]
@@ -306,8 +306,7 @@
 "execution_count": 2,
 "id": "d3dae5fd",
 "metadata": {},
-"outputs": [
-],
+"outputs": [],
 "source": [
 "! kubectl get service raycluster-autoscaler-head-svc\n",
 "\n",
@@ -379,7 +378,7 @@
 "\n",
 "### Deleting a Ray Cluster\n",
 "\n",
-"To delete the Ray Cluster we deployed in this example, you can run either of the following commands."
+"To delete the Ray Cluster we deployed in this example, run the following command."
 ]
 },
 {
@@ -393,25 +392,6 @@
 "! kubectl delete raycluster raycluster-autoscaler"
 ]
 },
-{
-"cell_type": "markdown",
-"id": "d7aa0221",
-"metadata": {},
-"source": [
-"**OR**"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "112e6d2e",
-"metadata": {},
-"outputs": [],
-"source": [
-"# Delete by reference to the yaml file we used to define the RayCluster CR \n",
-"! kubectl delete -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml"
-]
-},
 {
 "cell_type": "markdown",
 "id": "0de87d9d",
@@ -450,7 +430,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"! kubectl delete -k kuberay/ray-operator/config/default"
+"! kubectl delete -k \"github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s\""
 ]
 },
 {
@@ -490,7 +470,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.7.11"
+"version": "3.8.10"
 }
 },
 "nbformat": 4,
@@ -44,13 +44,12 @@ First, follow the [quickstart guide](kuberay-quickstart) to create an autoscalin

 ```bash
 # Optionally use kind to run the examples locally.
-# kind create cluster
+# $ kind create cluster

-$ git clone https://github.com/ray-project/kuberay -b release-0.3
 # Create the KubeRay operator.
-$ kubectl create -k kuberay/ray-operator/config/default
+$ kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s"
 # Create an autoscaling Ray cluster.
-$ kubectl apply -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml
+$ kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/release-0.3/ray-operator/config/samples/ray-cluster.autoscaler.yaml
 ```

 Now, we can run a Ray program on the head pod that uses [``request_resources``](ref-autoscaler-sdk) to scale the cluster to a total of 3 CPUs. The head and worker pods in our [example cluster config](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-cluster.autoscaler.yaml) each have a capacity of 1 CPU, and we specified a minimum of 1 worker pod. Thus, the request should trigger upscaling of one additional worker pod.
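
Before running the scaling example, it can help to wait for the head pod to become ready. The sketch below uses the `ray.io/cluster` selector this guide uses further down; the `ray.io/node-type=head` label is assumed from KubeRay's pod-labeling convention.

```bash
# Block until the head pod of raycluster-autoscaler is ready (5 minute timeout).
$ kubectl wait pod \
    --selector=ray.io/cluster=raycluster-autoscaler,ray.io/node-type=head \
    --for=condition=Ready --timeout=300s
```
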
@@ -66,7 +65,7 @@ $ kubectl get pods --selector=ray.io/cluster=raycluster-autoscaler --selector=ra

 Then, we can run the Ray program using ``kubectl exec``:
 ```bash
-$ kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- python -c \"import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=3)
+$ kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- python -c "import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=3)"
 ```

 The last command should have triggered Ray pod upscaling. To confirm the new worker pod is up, let's query the RayCluster's pods again:
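
If the extra worker does not appear, the autoscaler's decisions can be inspected from its sidecar container on the head pod. A hedged sketch, assuming the sidecar container is named `autoscaler` as in the KubeRay autoscaler sample config, and substituting the real head pod name:

```bash
# Tail the autoscaler sidecar's recent log output on the head pod.
$ kubectl logs raycluster-autoscaler-head-xxxxx -c autoscaler --tail=50
```
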
@@ -122,10 +122,7 @@ for instructions on this step.
 Now, run the following commands to deploy the Fluent Bit ConfigMap and a single-pod RayCluster with
 a Fluent Bit sidecar.
 ```shell
-# Starting from the parent of cloned Ray master.
-pushd ray/doc/source/cluster/kubernetes/configs/
-kubectl apply -f ray-cluster.log.yaml
-popd
+kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml
 ```

 Determine the Ray pod's name with
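
The guide goes on to show how to find the pod's name; a label-selector variant of that lookup, using the `raycluster-complete-logs` cluster name that appears in the log command below, is:

```shell
kubectl get pods --selector=ray.io/cluster=raycluster-complete-logs
```
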
@@ -145,6 +142,4 @@ kubectl logs raycluster-complete-logs-head-xxxxx -c fluentbit
 [Fluentd]: https://docs.fluentd.org/
 [Promtail]: https://grafana.com/docs/loki/latest/clients/promtail/
 [KubDoc]: https://kubernetes.io/docs/concepts/cluster-administration/logging/
-<!-- TODO: fix this -->
-[ConfigLink]: https://github.com/ray-project/ray/tree/master/doc/source/cluster/
-<!-- [ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/779e9f7c5733ef9a471ad2bb61723158ff942e92/doc/source/cluster/ray-clusters-on-kubernetes/configs/ray-cluster.log.yaml -->
+[ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml
@@ -70,13 +70,13 @@ We'll use the {ref}`Ray Job Python SDK <ray-job-sdk>` to submit the XGBoost work
 ```

 To submit the workload, run the above Python script.
-The script is also available in the Ray repository.
+The script is available [in the Ray repository][XGBSubmit].

 ```shell
-# From the parent directory of cloned Ray master.
-pushd ray/doc/source/cluster/doc_code/
+# Download the above script.
+curl https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py -o xgboost_submit.py
+# Run the script.
 python xgboost_submit.py
-popd
 ```

 ### Observe progress
@@ -130,3 +130,5 @@ Delete your Ray cluster with the following command:
 ```shell
 ray down -y cluster.yaml
 ```
+
+[XGBSubmit]: https://github.com/ray-project/ray/blob/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py
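
Before tearing the cluster down, the cluster launcher can run a quick status check on the head node; a sketch assuming the same `cluster.yaml` used above:

```shell
# Print autoscaler and resource status from the head node.
ray exec cluster.yaml 'ray status'
```
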
@@ -151,9 +151,8 @@ class StandardAutoscaler:
 `ray start --head --autoscaling-config=/path/to/config.yaml` on a instance
 that has permission to launch other instances, or you can also use `ray up
 /path/to/config.yaml` from your laptop, which will configure the right
-AWS/Cloud roles automatically. See the documentation for a full definition
-of autoscaling behavior:
-https://docs.ray.io/en/master/cluster/autoscaling.html
+AWS/Cloud roles automatically. See the Ray documentation
+(https://docs.ray.io/en/latest/) for a full definition of autoscaling behavior.

 StandardAutoscaler's `update` method is periodically called in
 `monitor.py`'s monitoring loop.

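
For context on the two launch paths this docstring describes, a minimal sketch (the config path is a placeholder):

```shell
# Path 1: start an autoscaling head node directly on a machine that can launch instances.
ray start --head --autoscaling-config=/path/to/config.yaml

# Path 2: launch the whole cluster from a laptop; -y skips the confirmation prompt.
ray up -y /path/to/config.yaml
```
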
|
Loading…
Add table
Reference in a new issue