mirror of https://github.com/vale981/ray (synced 2025-03-05 10:01:43 -05:00)
[test][k8s] Restore kubernetes test directory, adds some info (#18982)
parent aa0cab5cae
commit bfd706aea3
7 changed files with 87 additions and 0 deletions
@ -31,6 +31,7 @@ This checklist is meant to be used in conjunction with the RELEASE_PROCESS.rst d
- [ ] Test passing
- [ ] Results added to `release/release_logs`
- [ ] microbenchmark
- [ ] `kubernetes` manual release tests pass

- [ ] ``weekly`` release test suite
- [ ] Test passing
@ -172,6 +172,9 @@ Release tests are added and maintained by the respective teams.
As another example, if you just want to kick off all nightly RLLib tests,
select the respective test suite and specify ``rllib`` in the test file filter.

6. **Kubernetes tests must be run manually.** Refer to ``kubernetes_manual_tests/README.md``.
Feel free to ping code owner(s) of OSS Kubernetes support to run these.

Identify and Resolve Release Blockers
-------------------------------------
If a release blocking issue arises in the course of testing, you should
25
release/kubernetes_manual_tests/README.md
Normal file
@ -0,0 +1,25 @@
# ray-k8s-tests

These tests are not automated and thus **must be run manually** for each release.
If you have issues running them, contact the code owner(s) for OSS Kubernetes support.

## How to run
1. Configure kubectl and Helm 3 to access a K8s cluster.
2. `git checkout releases/<release version>`
3. You might have to pip install the Ray wheel for the relevant commit locally (or `pip install -e`) in a conda env; see the Ray client note below.
4. `cd` to this directory.
5. `IMAGE=rayproject/ray:<release version> bash k8s_release_tests.sh`
6. Test outcomes are reported at the end of the output.

This runs three tests and does the necessary resource creation/teardown. The tests typically take about 15 minutes to finish.
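
Before starting the roughly 15-minute run, it can help to confirm that the prerequisites from the steps above are in place. The following is a minimal sketch, not part of this commit; the helper names `missing_tools` and `preflight` are hypothetical.

```shell
#!/bin/bash
# Hypothetical preflight helpers (not part of the commit).

# Print the name of each listed tool that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Succeed only if IMAGE is non-empty and every listed tool is installed.
preflight() {
    local missing
    if [ -z "${IMAGE:-}" ]; then
        echo "IMAGE is not set, e.g. IMAGE=rayproject/ray:<release version>" >&2
        return 1
    fi
    missing="$(missing_tools "$@")"
    if [ -n "$missing" ]; then
        echo "missing tools:" $missing >&2
        return 1
    fi
}
```

Assuming these functions are sourced into the current shell, step 5 could then be guarded as `preflight kubectl helm python && bash k8s_release_tests.sh`. This only checks that the tools exist, not that kubectl can actually reach a cluster.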

## Notes
0. Anyscale employees: You should have access to create a K8s cluster using either GKE or EKS; ask the OSS Kubernetes code owner if in doubt.
1. Your Ray cluster should be able to accommodate 30 1-CPU pods to run all of the tests.
2. These tests use basic Ray client functionality, so your locally installed Ray version may need to be updated to match the one in the release image.
3. The tests do a poor job of cleaning up Ray client port-forwarding processes. If a test fails, a port-forwarding process may be left running in the background. To identify the rogue process, run `ps aux | grep "port-forward"`, then `kill` it.
4. Some errors will appear on the screen during the run. That's normal: error recovery is being tested.
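
The port-forward cleanup in note 3 can be sketched as a small filter over `ps aux` output. This is an illustrative helper, not part of the commit; the function name `port_forward_pids` is hypothetical. It matches any process with `port-forward` in its command line, so review the PIDs before killing anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit).
# Reads `ps aux`-style lines on stdin and prints the PID (second column)
# of every line that mentions "port-forward", skipping the grep process itself.
port_forward_pids() {
    awk '/port-forward/ && !/grep/ { print $2 }'
}
```

Typical use would be `ps aux | port_forward_pids` to list candidates, followed by `kill <pid>` for each process you recognize as a leftover from the tests.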

## Running individual tests
To run any of the three tests individually, replace `k8s_release_tests.sh` in step 5 of **How to run** with `k8s-test.sh`, `helm-test.sh`, or `k8s-test-scale.sh`.
Only the last of these needs 30 1-CPU pods; 10 is enough for either of the other two. The scale test is currently somewhat flaky. Rerun it if it fails.
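
The pod requirements stated in the README can be captured in a small lookup, which is handy when sizing a cluster for a single test. This is a sketch only; the function name `pods_needed` is hypothetical and the numbers are taken directly from the README text.

```shell
#!/bin/bash
# Hypothetical lookup (not part of the commit): number of 1-CPU pods the
# README says each script needs.
pods_needed() {
    case "$1" in
        k8s-test-scale.sh|k8s_release_tests.sh) echo 30 ;;
        k8s-test.sh|helm-test.sh) echo 10 ;;
        *) echo "unknown test script: $1" >&2; return 1 ;;
    esac
}
```

For example, `pods_needed helm-test.sh` prints `10`, while the full `k8s_release_tests.sh` run needs `30` because it includes the scale test.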
8
release/kubernetes_manual_tests/helm-test.sh
Executable file
@ -0,0 +1,8 @@
#!/bin/bash
set -x
kubectl create namespace helm-test
kubectl create namespace helm-test2
KUBERNETES_OPERATOR_TEST_NAMESPACE=helm-test KUBERNETES_OPERATOR_TEST_IMAGE="$IMAGE" python ../../python/ray/tests/kubernetes_e2e/test_helm.py
kubectl delete namespace helm-test
kubectl delete namespace helm-test2
kubectl delete -f ../../deploy/charts/ray/crds/cluster_crd.yaml
11
release/kubernetes_manual_tests/k8s-test-scale.sh
Executable file
@ -0,0 +1,11 @@
#!/bin/bash
set -x
kubectl create namespace scale-test
kubectl create namespace scale-test2
KUBERNETES_OPERATOR_TEST_NAMESPACE=scale-test KUBERNETES_OPERATOR_TEST_IMAGE="$IMAGE" python ../../python/ray/tests/kubernetes_e2e/test_k8s_operator_scaling.py
kubectl -n scale-test delete --all rayclusters
kubectl -n scale-test2 delete --all rayclusters
kubectl delete -f ../../deploy/components/operator_cluster_scoped.yaml
kubectl delete namespace scale-test
kubectl delete namespace scale-test2
kubectl delete -f ../../deploy/charts/ray/crds/cluster_crd.yaml
9
release/kubernetes_manual_tests/k8s-test.sh
Executable file
@ -0,0 +1,9 @@
#!/bin/bash
set -x
kubectl create namespace basic-test
kubectl apply -f ../../deploy/charts/ray/crds/cluster_crd.yaml
KUBERNETES_OPERATOR_TEST_NAMESPACE=basic-test KUBERNETES_OPERATOR_TEST_IMAGE="$IMAGE" python ../../python/ray/tests/kubernetes_e2e/test_k8s_operator_basic.py
kubectl -n basic-test delete --all rayclusters
kubectl -n basic-test delete deployment ray-operator
kubectl delete namespace basic-test
kubectl delete -f ../../deploy/charts/ray/crds/cluster_crd.yaml
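
One detail worth noting when reading these scripts: a bash script without an explicit `exit` returns the status of its last command, which here is a `kubectl delete` cleanup step rather than the Python test. Since `k8s_release_tests.sh` reads `$?` after each script, the reported outcome may reflect the cleanup instead of the test. One possible way to preserve the test's status is sketched below; `run_then_cleanup` is a hypothetical helper and not part of this commit.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit).
# Runs a test command, then a cleanup command, and returns the status of
# the test command regardless of how the cleanup went.
run_then_cleanup() {
    local test_cmd cleanup_cmd status
    test_cmd="$1"
    cleanup_cmd="$2"
    sh -c "$test_cmd"
    status=$?
    sh -c "$cleanup_cmd"
    return "$status"
}
```

In a test script, the same idea amounts to saving `$?` right after the `python ...` line and ending the script with `exit` on that saved value once the `kubectl delete` steps have run.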
30
release/kubernetes_manual_tests/k8s_release_tests.sh
Normal file
@ -0,0 +1,30 @@
#!/bin/bash
set -x
IMAGE="$IMAGE" bash k8s-test.sh
BASIC_SUCCEEDED=$?
IMAGE="$IMAGE" bash helm-test.sh
HELM_SUCCEEDED=$?
IMAGE="$IMAGE" bash k8s-test-scale.sh
SCALE_SUCCEEDED=$?

if (( BASIC_SUCCEEDED == 0 ))
then
    echo "k8s-test.sh test succeeded"
else
    echo "k8s-test.sh test failed"
fi

if (( HELM_SUCCEEDED == 0 ))
then
    echo "helm-test.sh test succeeded"
else
    echo "helm-test.sh test failed"
fi

if (( SCALE_SUCCEEDED == 0 ))
then
    echo "k8s-test-scale.sh test succeeded"
else
    echo "k8s-test-scale.sh test failed. Try re-running just k8s-test-scale.sh. It's expected to be flaky."
fi

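
The three near-identical reporting blocks in `k8s_release_tests.sh` could also be expressed as one loop that additionally returns a non-zero status when any test failed. This is a sketch of an alternative, not what the commit does; `summarize` is a hypothetical function that takes `name=status` pairs.

```shell
#!/bin/bash
# Hypothetical summary helper (not part of the commit).
# Each argument has the form "<script name>=<exit status>".
# Prints one line per script and returns 1 if any status was non-zero.
summarize() {
    local failed pair name status
    failed=0
    for pair in "$@"; do
        name="${pair%%=*}"
        status="${pair##*=}"
        if [ "$status" -eq 0 ]; then
            echo "$name succeeded"
        else
            echo "$name failed"
            failed=1
        fi
    done
    return "$failed"
}
```

Under that assumption, the end of the script would become `summarize "k8s-test.sh=$BASIC_SUCCEEDED" "helm-test.sh=$HELM_SUCCEEDED" "k8s-test-scale.sh=$SCALE_SUCCEEDED"`, so the aggregate run fails visibly when any individual test does.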