ray/release/benchmarks
mwtian 513881584d
[Core] install jemalloc in Ray docker and use jemalloc in benchmark release tests (#26112)
There is mysterious memory usage growth in Ray clusters that disappears when running with jemalloc. Until we are able to figure out the root cause, using jemalloc by default seems to be a good workaround. Because of its efficiency, using jemalloc by default may also be beneficial in its own right, but we need to run more benchmarks to verify.
2022-06-27 23:26:56 -07:00
distributed lower the utilization threshold in many tasks scheduling test by 5% (#24758) 2022-05-13 10:44:58 -07:00
object_store [Release Test] Add perf metrics for core scalability tests (#23110) 2022-03-14 10:20:39 +09:00
single_node [Release Test] Add perf metrics for core scalability tests (#23110) 2022-03-14 10:20:39 +09:00
app_config.yaml [Core] install jemalloc in Ray docker and use jemalloc in benchmark release tests (#26112) 2022-06-27 23:26:56 -07:00
distributed.yaml Migrate scalability tests (#22901) 2022-03-08 17:22:41 -08:00
distributed_smoke_test.yaml Migrate scalability tests (#22901) 2022-03-08 17:22:41 -08:00
many_nodes.yaml [Nightly tests] Improve k8s testing (#23108) 2022-03-14 03:49:15 -07:00
object_store.yaml [Nightly tests] Improve k8s testing (#23108) 2022-03-14 03:49:15 -07:00
README.md Migrate scalability tests (#22901) 2022-03-08 17:22:41 -08:00
scheduling.yaml use smaller instance for scheduling tests (#25635) 2022-06-10 17:09:35 +00:00
single_node.yaml Migrate scalability tests (#22901) 2022-03-08 17:22:41 -08:00

Ray Scalability Envelope

Distributed Benchmarks

All distributed tests are run on 64 nodes with 64 cores per node. The maximum number of nodes is reached by adding 4-core nodes.

| Dimension | Quantity |
| --- | --- |
| # of nodes in cluster (with trivial task workload) | 250+ |
| # of actors in cluster (with trivial workload) | 10k+ |
| # of simultaneously running tasks | 10k+ |
| # of simultaneously running placement groups | 1k+ |
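For readers unfamiliar with the terms in this table, the following is a minimal sketch of what a "trivial workload" of tasks, actors, and a placement group might look like. It is not the benchmark code from this directory. The names `trivial_task`, `TrivialActor`, and `trivial_workload_demo` are hypothetical, the counts are tiny compared with the envelope, and the Ray-dependent function assumes Ray is installed.

```python
def trivial_task(i):
    """A trivial task body: return the input unchanged."""
    return i


class TrivialActor:
    """A trivial actor: it only answers a ping."""

    def ping(self):
        return "pong"


def trivial_workload_demo(num_tasks=8, num_actors=2):
    """Hypothetical demo; the sizes are illustrative, not the benchmark's."""
    # Imported lazily so the helpers above work without Ray installed.
    import ray
    from ray.util.placement_group import placement_group, remove_placement_group

    ray.init(ignore_reinit_error=True)

    # Trivial tasks submitted together; how many actually run at once
    # depends on the CPUs available in the cluster.
    remote_task = ray.remote(trivial_task)
    task_results = ray.get([remote_task.remote(i) for i in range(num_tasks)])

    # Several trivial actors, each backed by its own worker process.
    actor_cls = ray.remote(TrivialActor)
    actors = [actor_cls.remote() for _ in range(num_actors)]
    actor_results = ray.get([a.ping.remote() for a in actors])

    # One placement group that reserves a single 1-CPU bundle.
    pg = placement_group([{"CPU": 1}], strategy="PACK")
    ray.get(pg.ready())
    remove_placement_group(pg)

    return task_results, actor_results
```

Calling `trivial_workload_demo()` on a laptop only illustrates the API shape; the quantities in the table come from the multi-node cluster described above.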

Object Store Benchmarks

| Dimension | Quantity |
| --- | --- |
| 1 GiB object broadcast (# of nodes) | 50+ |
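A broadcast here means one object in the object store being read by tasks on many nodes. The sketch below shows the general pattern under stated assumptions: `payload_nbytes` and `broadcast_demo` are hypothetical names, the default payload is far smaller than 1 GiB, and NumPy and Ray are assumed to be installed. It is not the benchmark script itself.

```python
GIB = 2**30  # bytes in one GiB


def payload_nbytes(gib):
    """Convert a size in GiB to bytes."""
    return int(gib * GIB)


def broadcast_demo(size_bytes=64 * 2**20, num_consumers=4):
    """Hypothetical demo: put one array, then read it from several tasks."""
    # Imported lazily so payload_nbytes works without NumPy or Ray.
    import numpy as np
    import ray

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def consume(arr):
        # Ray resolves the ObjectRef argument to the array before the call.
        return arr.nbytes

    # Store the payload once; every consumer receives the same reference.
    ref = ray.put(np.zeros(size_bytes, dtype=np.uint8))
    return ray.get([consume.remote(ref) for _ in range(num_consumers)])
```

On a single machine all consumers share the local object store, so this does not exercise a real network broadcast. A multi-node cluster with consumers spread across nodes would be needed to approach the scenario in the table.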

Single Node Benchmarks

All single node benchmarks are run on a single m4.16xlarge.

| Dimension | Quantity |
| --- | --- |
| # of object arguments to a single task | 10,000+ |
| # of objects returned from a single task | 3,000+ |
| # of plasma objects in a single ray.get call | 10,000+ |
| # of tasks queued on a single node | 1,000,000+ |
| Maximum ray.get numpy object size | 100 GiB+ |
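To make the first three dimensions concrete, here is a small sketch of passing many object arguments to one task, returning many objects from one task, and fetching many objects in a single `ray.get` call. The function names are hypothetical, the counts are deliberately small, and Ray is assumed to be installed. The actual benchmark scripts may be structured differently.

```python
def count_args(*args):
    """Task body: report how many arguments arrived."""
    return len(args)


def make_values(n):
    """Task body: produce n separate return values."""
    return list(range(n))


def single_node_demo(num_args=100, num_returns=50):
    """Hypothetical demo; the sizes are illustrative, not the benchmark's."""
    # Imported lazily so the helpers above work without Ray installed.
    import ray

    ray.init(ignore_reinit_error=True)

    # Many object arguments to a single task: each ObjectRef passed as a
    # top-level argument is resolved to its value before the task runs.
    arg_refs = [ray.put(i) for i in range(num_args)]
    num_received = ray.get(ray.remote(count_args).remote(*arg_refs))

    # Many objects returned from a single task: num_returns makes the call
    # hand back one ObjectRef per returned element.
    remote_make = ray.remote(make_values).options(num_returns=num_returns)
    return_refs = remote_make.remote(num_returns)

    # Many objects fetched in a single ray.get call.
    values = ray.get(return_refs)
    return num_received, values
```

With the defaults, `single_node_demo()` should report 100 received arguments and a list of 50 values. Scaling the counts toward the numbers in the table is what the benchmarks measure.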