ray/release/benchmarks
Kai Fricke 6c5229295e
[ci/release] Support running tests with different python versions (#24843)
OSS release tests currently run with a hardcoded Python 3.7 base image. In the future we will want to run tests on different Python versions.
This PR adds support for a new `python` field in the test configuration. The `python` field determines both the base image used in the Buildkite runner Docker container (for Ray client compatibility) and the base image for the Anyscale cluster environments.

Note that in Buildkite, we will still only wait for the Python 3.7 base image before kicking off tests. That is acceptable, as we can assume that most wheels finish building at a similar time, so even if we wait for the 3.7 image and kick off a 3.8 test, that runner will only wait about 5-10 more minutes.
2022-05-17 17:03:12 +01:00
| Name | Last commit | Date |
| ---- | ----------- | ---- |
| distributed | lower the utilization threshold in many tasks scheduling test by 5% (#24758) | 2022-05-13 10:44:58 -07:00 |
| object_store | [Release Test] Add perf metrics for core scalability tests (#23110) | 2022-03-14 10:20:39 +09:00 |
| single_node | [Release Test] Add perf metrics for core scalability tests (#23110) | 2022-03-14 10:20:39 +09:00 |
| app_config.yaml | [ci/release] Support running tests with different python versions (#24843) | 2022-05-17 17:03:12 +01:00 |
| distributed.yaml | Migrate scalability tests (#22901) | 2022-03-08 17:22:41 -08:00 |
| distributed_smoke_test.yaml | Migrate scalability tests (#22901) | 2022-03-08 17:22:41 -08:00 |
| many_nodes.yaml | [Nightly tests] Improve k8s testing (#23108) | 2022-03-14 03:49:15 -07:00 |
| object_store.yaml | [Nightly tests] Improve k8s testing (#23108) | 2022-03-14 03:49:15 -07:00 |
| README.md | Migrate scalability tests (#22901) | 2022-03-08 17:22:41 -08:00 |
| scheduling.yaml | Migrate scalability tests (#22901) | 2022-03-08 17:22:41 -08:00 |
| single_node.yaml | Migrate scalability tests (#22901) | 2022-03-08 17:22:41 -08:00 |

Ray Scalability Envelope

Distributed Benchmarks

All distributed tests are run on 64 nodes with 64 cores per node. The maximum number of nodes is reached by adding 4-core nodes.

| Dimension | Quantity |
| --------- | -------- |
| # nodes in cluster (with trivial task workload) | 250+ |
| # actors in cluster (with trivial workload) | 10k+ |
| # simultaneously running tasks | 10k+ |
| # simultaneously running placement groups | 1k+ |
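
The cluster setup described above can be worked through as a quick calculation. This is a rough sketch that assumes the 64 large nodes stay in place and 4-core nodes are added until the 250-node mark from the table is reached. The exact node mix used in the release tests is not stated here, and the names below (`cluster_shape`, the constants) are illustrative only.

```python
# Figures taken from the README text and table.
BASE_NODES = 64            # large nodes used for all distributed tests
CORES_PER_BASE_NODE = 64   # cores on each large node
SMALL_NODE_CORES = 4       # cores on each node added to raise the node count
TARGET_NODES = 250         # lower bound of the "250+" nodes entry


def cluster_shape(target_nodes=TARGET_NODES):
    """Return the assumed node mix and total core count for a target node count.

    Assumption: every node beyond the 64 large ones is a 4-core node.
    """
    small_nodes = max(0, target_nodes - BASE_NODES)
    total_cores = BASE_NODES * CORES_PER_BASE_NODE + small_nodes * SMALL_NODE_CORES
    return {
        "base_nodes": BASE_NODES,
        "small_nodes": small_nodes,
        "total_cores": total_cores,
    }


if __name__ == "__main__":
    print(cluster_shape())
```

Under these assumptions the small nodes add comparatively few cores, which fits the "trivial task workload" wording: the node-count dimension measures how many nodes the cluster can hold together, not how much compute it provides.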

Object Store Benchmarks

| Dimension | Quantity |
| --------- | -------- |
| 1 GiB object broadcast (# of nodes) | 50+ |

Single Node Benchmarks

All single node benchmarks are run on a single m4.16xlarge.

| Dimension | Quantity |
| --------- | -------- |
| # of object arguments to a single task | 10,000+ |
| # of objects returned from a single task | 3,000+ |
| # of plasma objects in a single ray.get call | 10,000+ |
| # of tasks queued on a single node | 1,000,000+ |
| Maximum ray.get numpy object size | 100 GiB+ |