ray/benchmarks
Yi Cheng b6b4d4cf57
[test] Update base image for nightly testing (#20680)
## Why are these changes needed?

`base_image: "anyscale/ray-ml:pinned-nightly-py37"` no longer exists, which causes many nightly tests to fail. Change it to `base_image: "anyscale/ray-ml:nightly-py37-gpu"`.
2021-11-23 11:06:44 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| distributed | Revert "[core] Refactor test_many_tasks (#18169)" (#18216) | 2021-08-30 10:35:23 -07:00 |
| object_store | [Scalability Envelope] Include broadcast time in test_object_store result json (#18974) | 2021-09-29 13:49:16 -07:00 |
| single_node | Fix test_single_node json report (#19075) | 2021-10-04 13:05:32 -07:00 |
| app_config.yaml | [test] Update base image for nightly testing (#20680) | 2021-11-23 11:06:44 -08:00 |
| benchmark_tests.yaml | Split scalability envelope + smoke tests (#17455) | 2021-07-30 10:20:19 -07:00 |
| distributed.yaml | Split scalability envelope + smoke tests (#17455) | 2021-07-30 10:20:19 -07:00 |
| distributed_smoke_test.yaml | Split scalability envelope + smoke tests (#17455) | 2021-07-30 10:20:19 -07:00 |
| many_nodes.yaml | Split scalability envelope + smoke tests (#17455) | 2021-07-30 10:20:19 -07:00 |
| object_store.yaml | Integrate scalability envelope with releaser (#16417) | 2021-06-15 10:42:55 -07:00 |
| README.md | Move scalability envelope back down to 250 nodes (#15381) | 2021-04-16 19:39:24 -07:00 |
| single_node.yaml | Integrate scalability envelope with releaser (#16417) | 2021-06-15 10:42:55 -07:00 |

Ray Scalability Envelope

Distributed Benchmarks

All distributed tests are run on 64 nodes with 64 cores per node. The maximum node count is reached by adding 4-core nodes.

| Dimension | Quantity |
| --- | --- |
| # of nodes in cluster (with trivial task workload) | 250+ |
| # of actors in cluster (with trivial workload) | 10k+ |
| # of simultaneously running tasks | 10k+ |
| # of simultaneously running placement groups | 1k+ |
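
The sizing figures above can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the 4-core nodes are added on top of the 64 base nodes (the README does not say whether they replace any), and that each concurrently running task reserves an equal share of the base cluster's CPUs. The function names `cluster_plan` and `max_cpus_per_task` are hypothetical helpers, not part of the benchmark code.

```python
# Figures taken from the text and table above.
BASE_NODES = 64
CORES_PER_BASE_NODE = 64
SMALL_NODE_CORES = 4
TARGET_NODES = 250
TARGET_CONCURRENT_TASKS = 10_000


def cluster_plan(base_nodes, cores_per_base_node, small_node_cores, target_nodes):
    """Return (base_cores, extra_nodes, total_cores).

    Assumption: small nodes are added in addition to the base nodes.
    """
    base_cores = base_nodes * cores_per_base_node
    extra_nodes = max(0, target_nodes - base_nodes)
    total_cores = base_cores + extra_nodes * small_node_cores
    return base_cores, extra_nodes, total_cores


def max_cpus_per_task(total_cores, concurrent_tasks):
    """Largest equal CPU share per task that lets all tasks run at once."""
    return total_cores / concurrent_tasks


base_cores, extra_nodes, total_cores = cluster_plan(
    BASE_NODES, CORES_PER_BASE_NODE, SMALL_NODE_CORES, TARGET_NODES
)
print(base_cores)   # 4096 cores in the 64 x 64 base cluster
print(extra_nodes)  # 186 additional 4-core nodes to reach 250 nodes
print(total_cores)  # 4840 cores at 250 nodes
print(max_cpus_per_task(base_cores, TARGET_CONCURRENT_TASKS))  # 0.4096
```

Under these assumptions, running 10k tasks at once on the base cluster means each task can reserve at most about 0.4 of a CPU, so tasks in such a test would have to request fractional CPU resources.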

Object Store Benchmarks

Dimension Quantity
1 GiB object broadcast (# of nodes) 50+

Single Node Benchmarks

All single-node benchmarks are run on a single m4.16xlarge instance.

| Dimension | Quantity |
| --- | --- |
| # of object arguments to a single task | 10,000+ |
| # of objects returned from a single task | 3,000+ |
| # of plasma objects in a single ray.get call | 10,000+ |
| # of tasks queued on a single node | 1,000,000+ |
| Maximum ray.get numpy object size | 100 GiB+ |
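
To make the first two dimensions in the table concrete, here is a minimal sketch of what "object arguments to a single task" and "objects returned from a single task" could look like in Ray. It is not the benchmark's actual code: the function names and the `TARGETS` dictionary are hypothetical, and Ray is imported inside the functions so the module can be loaded on a machine without Ray installed.

```python
# Thresholds copied from the table above (illustrative constants).
TARGETS = {"object_args": 10_000, "returned_objects": 3_000}


def count_object_args(num_args):
    """Pass `num_args` object references to one task; return how many arrived."""
    import ray  # lazy import: only needed when the check is actually run

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def count_args(*args):
        return len(args)

    # Each ray.put call creates one object; all refs go to a single task.
    refs = [ray.put(i) for i in range(num_args)]
    return ray.get(count_args.remote(*refs))


def count_returned_objects(num_returns):
    """Have one task return `num_returns` (>= 2) separate objects; count them."""
    import ray

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def make_values(n):
        # With num_returns=n, the task must return an iterable of n items.
        return list(range(n))

    refs = make_values.options(num_returns=num_returns).remote(num_returns)
    return len(ray.get(refs))
```

On a machine with Ray installed, calling `count_object_args(TARGETS["object_args"])` or `count_returned_objects(TARGETS["returned_objects"])` exercises the same kind of limit the table reports. Starting with much smaller counts is a sensible first check.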