mirror of
https://github.com/vale981/ray
synced 2025-03-05 18:11:42 -05:00
Ray Scalability Envelope
Distributed Benchmarks
All distributed tests run on 64 nodes with 64 cores per node. The maximum node count is reached by adding 4-core nodes.
Dimension | Quantity |
---|---|
# nodes in cluster (with trivial task workload) | 250+ |
# actors in cluster (with trivial workload) | 10k+ |
# simultaneously running tasks | 10k+ |
# simultaneously running placement groups | 1k+ |
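To give a sense of the cluster size behind these numbers, the following minimal sketch adds up the core count from the figures stated above. It assumes that every node beyond the base 64 is a 4-core node and that the cluster has exactly 250 nodes (the table only says "250+"). The function name `cluster_cores` is hypothetical and not part of the benchmark code.

```python
# Figures taken from the description above.
BASE_NODES = 64            # nodes used for all distributed tests
CORES_PER_BASE_NODE = 64   # cores on each base node
CORES_PER_EXTRA_NODE = 4   # cores on each node added to raise the node count


def cluster_cores(total_nodes: int) -> int:
    """Total cores, assuming every node beyond the base 64 has 4 cores."""
    if total_nodes < BASE_NODES:
        raise ValueError("total_nodes must be at least the base node count")
    extra_nodes = total_nodes - BASE_NODES
    return BASE_NODES * CORES_PER_BASE_NODE + extra_nodes * CORES_PER_EXTRA_NODE


print(cluster_cores(64))   # base cluster: 4096 cores
print(cluster_cores(250))  # assumed 250-node cluster: 4840 cores
```

Under these assumptions, the 186 extra nodes add only 744 cores, so most of the compute capacity stays on the base 64 nodes.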
Object Store Benchmarks
Dimension | Quantity |
---|---|
1 GiB object broadcast (# of nodes) | 50+ |
Single Node Benchmarks
All single node benchmarks are run on a single m4.16xlarge.
Dimension | Quantity |
---|---|
# of object arguments to a single task | 10000+ |
# of objects returned from a single task | 3000+ |
# of plasma objects in a single ray.get call | 10000+ |
# of tasks queued on a single node | 1,000,000+ |
Maximum ray.get numpy object size | 100 GiB+ |
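As a rough illustration of what a 100 GiB numpy object means in practice, the sketch below converts a size in GiB into an element count. It assumes 8-byte elements (for example `float64`); the benchmark's actual dtype is not stated here, and the helper name `elements_for` is hypothetical.

```python
GIB = 2**30  # bytes in one gibibyte


def elements_for(size_gib: int, itemsize_bytes: int = 8) -> int:
    """Number of array elements that fit in size_gib GiB.

    itemsize_bytes defaults to 8, i.e. an assumed float64 array.
    """
    return size_gib * GIB // itemsize_bytes


print(elements_for(100))                    # 13421772800 (float64 elements)
print(elements_for(100, itemsize_bytes=1))  # 107374182400 (uint8 elements)
```

An array of that size has more than 13 billion 8-byte elements, so any experiment at this scale needs a machine with well over 100 GiB of memory.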