[clusters][docs] Provide urls to content, fix typos (#27936)
parent a08ed6ba75
commit 17a8db048f
6 changed files with 35 additions and 63 deletions
@@ -55,10 +55,7 @@ Broadly speaking, it is more efficient to use a few large Ray pods than many sma
 We recommend taking a look at the [config file][ConfigLink] applied in the following command.
 ```shell
-# Starting from the parent directory of cloned Ray master,
-pushd ray/doc/source/cluster/kubernetes/configs/
-kubectl apply -f xgboost-benchmark.yaml
-popd
+kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark.yaml
 ```

 A Ray head pod and 9 Ray worker pods will be created.
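One way to confirm that the head and worker pods have come up is a short script against the Kubernetes API. This is a minimal sketch, assuming the `kubernetes` Python client is installed and the RayCluster was created in the `default` namespace; the cluster name `raycluster-xgboost-benchmark` is the one used later in this guide's cleanup step.

```python
from kubernetes import client, config

config.load_kube_config()  # use the same kubeconfig that kubectl uses
core = client.CoreV1Api()

# KubeRay labels each pod of a RayCluster with `ray.io/cluster=<RayCluster name>`.
pods = core.list_namespaced_pod(
    namespace="default",
    label_selector="ray.io/cluster=raycluster-xgboost-benchmark",
)
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
```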
@@ -66,7 +63,7 @@ A Ray head pod and 9 Ray worker pods will be created.
 ```{admonition} Optional: Deploying an autoscaling Ray cluster
 If you've set up an autoscaling node group or pool, you may wish to deploy
-an autoscaling cluster by applying the config `xgboost-benchmark-autoscaler.yaml`.
+an autoscaling cluster by applying the config [xgboost-benchmark-autoscaler.yaml][ConfigLinkAutoscaling].
 One Ray head pod will be created. Once the workload starts, the Ray autoscaler will trigger
 creation of Ray worker pods. Kubernetes autoscaling will then create nodes to place the Ray pods.
 ```
@@ -100,13 +97,13 @@ We'll use the {ref}`Ray Job Python SDK <ray-job-sdk>` to submit the XGBoost work
 ```

 To submit the workload, run the above Python script.
-The script is available in the Ray repository.
+The script is available [in the Ray repository][XGBSubmit].

 ```shell
 # From the parent directory of cloned Ray master.
 pushd ray/doc/source/cluster/doc_code/
 # Download the above script.
 curl https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py -o xgboost_submit.py
 # Run the script.
 python xgboost_submit.py
 popd
 ```

 ### Observe progress.
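For readers who haven't opened `xgboost_submit.py`, the core of such a submission script is the Ray Job Submission SDK. The following is a hedged sketch only; the dashboard address, entrypoint command, and runtime environment are illustrative assumptions rather than the script's actual contents (see [XGBSubmit] for those).

```python
from ray.job_submission import JobSubmissionClient

# Assumes the cluster's dashboard has been port-forwarded to localhost:8265.
client = JobSubmissionClient("http://127.0.0.1:8265")

submission_id = client.submit_job(
    # Hypothetical entrypoint; the real benchmark entrypoint lives in the Ray repo.
    entrypoint="python xgboost_benchmark.py",
    runtime_env={"pip": ["xgboost_ray"]},
)
print(f"Submitted job with id: {submission_id}")
print(f"Follow its logs with: ray job logs '{submission_id}' --follow")
```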
@@ -118,7 +115,7 @@ Use the following tools to observe its progress.

 To follow the job's logs, use the command printed by the above submission script.
 ```shell
-# Subsitute the Ray Job's submission id.
+# Substitute the Ray Job's submission id.
 ray job logs 'raysubmit_xxxxxxxxxxxxxxxx' --follow
 ```
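The job can also be polled through the same SDK instead of the `ray job` CLI. A small sketch, assuming the dashboard address used at submission time and substituting the real submission id:

```python
import time

from ray.job_submission import JobSubmissionClient, JobStatus

client = JobSubmissionClient("http://127.0.0.1:8265")
submission_id = "raysubmit_xxxxxxxxxxxxxxxx"  # substitute the Ray Job's submission id

# Poll until the job reaches a terminal state, then dump its logs.
status = client.get_job_status(submission_id)
while status not in {JobStatus.SUCCEEDED, JobStatus.FAILED, JobStatus.STOPPED}:
    print(f"Job status: {status}")
    time.sleep(30)
    status = client.get_job_status(submission_id)

print(f"Job finished with status: {status}")
print(client.get_job_logs(submission_id))
```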
@@ -184,6 +181,6 @@ kubectl delete raycluster raycluster-xgboost-benchmark
 If you're on a public cloud, don't forget to clean up the underlying
 node group and/or Kubernetes cluster.

-<!-- TODO: Fix this -->
-<!-- [ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/291bba69fb90ee5e8401540ef55b7b74dd13f5c5/doc/source/cluster/ray-clusters-on-kubernetes/configs/xgboost-benchmark-autoscaler.yaml -->
-[ConfigLink]: https://github.com/ray-project/ray/tree/master/doc/source/cluster/
+[ConfigLink]:https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark.yaml
+[ConfigLinkAutoscaling]: https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/xgboost-benchmark-autoscaler.yaml
+[XGBSubmit]: https://github.com/ray-project/ray/blob/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py
@@ -79,7 +79,7 @@
 "(kuberay-operator-deploy)=\n",
 "## Deploying the KubeRay operator\n",
 "\n",
-"Deploy the KubeRay Operator by cloning the KubeRay repo and applying the relevant configuration files from the master branch. "
+"Deploy the KubeRay Operator by applying the relevant configuration files from the KubeRay GitHub repo. "
 ]
 },
 {
@@ -89,12 +89,12 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"! git clone https://github.com/ray-project/kuberay -b release-0.3\n",
-"\n",
 "# This creates the KubeRay operator and all of the resources it needs.\n",
-"! kubectl create -k kuberay/ray-operator/config/default\n",
+"! kubectl create -k \"github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s\"\n",
 "\n",
-"# Note that we must use \"kubectl create\" in the above command. \"kubectl apply\" will not work due to https://github.com/ray-project/kuberay/issues/271"
+"# Note that we must use \"kubectl create\" in the above command. \"kubectl apply\" will not work due to https://github.com/ray-project/kuberay/issues/271\n",
+"\n",
+"# You may alternatively clone the KubeRay GitHub repo and deploy the operator's configuration from your local file system."
 ]
 },
 {
@@ -156,8 +156,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Deploy the Ray Cluster CR:\n",
-"! kubectl apply -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml\n",
+"# Deploy a sample Ray Cluster CR from the KubeRay repo:\n",
+"! kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/release-0.3/ray-operator/config/samples/ray-cluster.autoscaler.yaml\n",
 "\n",
 "# This Ray cluster is named `raycluster-autoscaler` because it has optional Ray Autoscaler support enabled."
 ]
@@ -306,8 +306,7 @@
 "execution_count": 2,
 "id": "d3dae5fd",
 "metadata": {},
-"outputs": [
-],
+"outputs": [],
 "source": [
 "! kubectl get service raycluster-autoscaler-head-svc\n",
 "\n",
@@ -379,7 +378,7 @@
 "\n",
 "### Deleting a Ray Cluster\n",
 "\n",
-"To delete the Ray Cluster we deployed in this example, you can run either of the following commands."
+"To delete the Ray Cluster we deployed in this example, run the following command."
 ]
 },
 {
@@ -393,25 +392,6 @@
 "! kubectl delete raycluster raycluster-autoscaler"
 ]
 },
-{
-"cell_type": "markdown",
-"id": "d7aa0221",
-"metadata": {},
-"source": [
-"**OR**"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "112e6d2e",
-"metadata": {},
-"outputs": [],
-"source": [
-"# Delete by reference to the yaml file we used to define the RayCluster CR \n",
-"! kubectl delete -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml"
-]
-},
 {
 "cell_type": "markdown",
 "id": "0de87d9d",
@@ -450,7 +430,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"! kubectl delete -k kuberay/ray-operator/config/default"
+"! kubectl delete -k \"github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s\""
 ]
 },
 {
@@ -490,7 +470,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.7.11"
+"version": "3.8.10"
 }
 },
 "nbformat": 4,
@@ -44,13 +44,12 @@ First, follow the [quickstart guide](kuberay-quickstart) to create an autoscalin

 ```bash
 # Optionally use kind to run the examples locally.
-# kind create cluster
+# $ kind create cluster

-$ git clone https://github.com/ray-project/kuberay -b release-0.3
 # Create the KubeRay operator.
-$ kubectl create -k kuberay/ray-operator/config/default
+$ kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.3.0&timeout=90s"
 # Create an autoscaling Ray cluster.
-$ kubectl apply -f kuberay/ray-operator/config/samples/ray-cluster.autoscaler.yaml
+$ kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/release-0.3/ray-operator/config/samples/ray-cluster.autoscaler.yaml
 ```

 Now, we can run a Ray program on the head pod that uses [``request_resources``](ref-autoscaler-sdk) to scale the cluster to a total of 3 CPUs. The head and worker pods in our [example cluster config](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-cluster.autoscaler.yaml) each have a capacity of 1 CPU, and we specified a minimum of 1 worker pod. Thus, the request should trigger upscaling of one additional worker pod.
@@ -66,7 +65,7 @@ $ kubectl get pods --selector=ray.io/cluster=raycluster-autoscaler --selector=ra

 Then, we can run the Ray program using ``kubectl exec``:
 ```bash
-$ kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- python -c \"import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=3)
+$ kubectl exec raycluster-autoscaler-head-xxxxx -it -c ray-head -- python -c "import ray; ray.init(); ray.autoscaler.sdk.request_resources(num_cpus=3)"
 ```

 The last command should have triggered Ray pod upscaling. To confirm the new worker pod is up, let's query the RayCluster's pods again:
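For reference, the one-liner passed to `kubectl exec` above expands to roughly the following program, which could also be saved to a file and run inside the head pod:

```python
import ray
from ray.autoscaler.sdk import request_resources

# Connect to the Ray cluster already running in this pod.
ray.init()

# Ask the autoscaler to scale the cluster so that at least 3 CPUs are available in total.
request_resources(num_cpus=3)
```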
@@ -122,10 +122,7 @@ for instructions on this step.
 Now, run the following commands to deploy the Fluent Bit ConfigMap and a single-pod RayCluster with
 a Fluent Bit sidecar.
 ```shell
-# Starting from the parent of cloned Ray master.
-pushd ray/doc/source/cluster/kubernetes/configs/
-kubectl apply -f ray-cluster.log.yaml
-popd
+kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml
 ```

 Determine the Ray pod's name with
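The Fluent Bit sidecar's logs can also be fetched programmatically rather than with `kubectl logs`. This is a hedged sketch, assuming the `kubernetes` Python client, the `default` namespace, and the head pod name reported by `kubectl get pods`:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Substitute the actual head pod name, e.g. raycluster-complete-logs-head-xxxxx.
log_text = core.read_namespaced_pod_log(
    name="raycluster-complete-logs-head-xxxxx",
    namespace="default",
    container="fluentbit",
)
print(log_text)
```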
@@ -145,6 +142,4 @@ kubectl logs raycluster-complete-logs-head-xxxxx -c fluentbit
 [Fluentd]: https://docs.fluentd.org/
 [Promtail]: https://grafana.com/docs/loki/latest/clients/promtail/
 [KubDoc]: https://kubernetes.io/docs/concepts/cluster-administration/logging/
-<!-- TODO: fix this -->
-[ConfigLink]: https://github.com/ray-project/ray/tree/master/doc/source/cluster/
-<!-- [ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/779e9f7c5733ef9a471ad2bb61723158ff942e92/doc/source/cluster/ray-clusters-on-kubernetes/configs/ray-cluster.log.yaml -->
+[ConfigLink]: https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml
@@ -70,13 +70,13 @@ We'll use the {ref}`Ray Job Python SDK <ray-job-sdk>` to submit the XGBoost work
 ```

 To submit the workload, run the above Python script.
-The script is also available in the Ray repository.
+The script is available [in the Ray repository][XGBSubmit].

 ```shell
 # From the parent directory of cloned Ray master.
 pushd ray/doc/source/cluster/doc_code/
 # Download the above script.
 curl https://raw.githubusercontent.com/ray-project/ray/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py -o xgboost_submit.py
 # Run the script.
 python xgboost_submit.py
 popd
 ```

 ### Observe progress
@@ -130,3 +130,5 @@ Delete your Ray cluster with the following command:
 ```shell
 ray down -y cluster.yaml
 ```
+
+[XGBSubmit]: https://github.com/ray-project/ray/blob/releases/2.0.0/doc/source/cluster/doc_code/xgboost_submit.py
@@ -151,9 +151,8 @@ class StandardAutoscaler:
 `ray start --head --autoscaling-config=/path/to/config.yaml` on a instance
 that has permission to launch other instances, or you can also use `ray up
 /path/to/config.yaml` from your laptop, which will configure the right
-AWS/Cloud roles automatically. See the documentation for a full definition
-of autoscaling behavior:
-https://docs.ray.io/en/master/cluster/autoscaling.html
+AWS/Cloud roles automatically. See the Ray documentation
+(https://docs.ray.io/en/latest/) for a full definition of autoscaling behavior.
 StandardAutoscaler's `update` method is periodically called in
 `monitor.py`'s monitoring loop.