Benchmarks
==========
Below we document key performance benchmarks for common AIR tasks and workflows.
Bulk Ingest
-----------
This task uses the DummyTrainer module to ingest 200 GiB of synthetic data.
We measure ingest performance across different cluster sizes.

- `Bulk Ingest Script`_
- `Bulk Ingest Cluster Configuration`_

For this benchmark, we provisioned each node's disk with enough capacity and throughput to absorb object spilling.

.. code-block :: yaml

    aws:
        BlockDeviceMappings:
            - DeviceName: /dev/sda1
              Ebs:
                Iops: 5000
                Throughput: 1000
                VolumeSize: 1000
                VolumeType: gp3

.. list-table ::

    * - **Cluster Setup**
      - **Performance**
      - **Disk Spill**
      - **Command**
    * - 1 m5.4xlarge node (1 actor)
      - 390 s (0.51 GiB/s)
      - 205 GiB
      - ``python data_benchmark.py --dataset-size-gb=200 --num-workers=1``
    * - 5 m5.4xlarge nodes (5 actors)
      - 70 s (2.85 GiB/s)
      - 206 GiB
      - ``python data_benchmark.py --dataset-size-gb=200 --num-workers=5``
    * - 20 m5.4xlarge nodes (20 actors)
      - 3.8 s (52.6 GiB/s)
      - 0 GiB
      - ``python data_benchmark.py --dataset-size-gb=200 --num-workers=20``
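
The throughput figures in the table above follow from dividing the dataset size by the wall-clock time. The following is a minimal sketch of that arithmetic; the helper name ``throughput_gib_per_s`` is hypothetical and is not part of the benchmark script.

```python
def throughput_gib_per_s(dataset_gib: float, seconds: float) -> float:
    """Return ingest throughput in GiB/s (hypothetical helper, for illustration only)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return dataset_gib / seconds


# Wall-clock times copied from the Bulk Ingest table (200 GiB dataset).
timings_s = {"1 node": 390.0, "5 nodes": 70.0, "20 nodes": 3.8}

for setup, seconds in timings_s.items():
    print(f"{setup}: {throughput_gib_per_s(200, seconds):.2f} GiB/s")
```

The printed values may differ from the table in the last digit, depending on how the original figures were rounded.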
XGBoost Batch Prediction
------------------------
This task uses the BatchPredictor module to process different amounts of data
using an XGBoost model.
We measure prediction performance across different cluster sizes and data sizes.

- `XGBoost Prediction Script`_
- `XGBoost Cluster Configuration`_

.. TODO: Add script for generating data and running the benchmark.

.. list-table ::

    * - **Cluster Setup**
      - **Data Size**
      - **Performance**
      - **Command**
    * - 1 m5.4xlarge node (1 actor)
      - 10 GB (26M rows)
      - 275 s (94.5k rows/s)
      - ``python xgboost_benchmark.py --size 10GB``
    * - 10 m5.4xlarge nodes (10 actors)
      - 100 GB (260M rows)
      - 331 s (786k rows/s)
      - ``python xgboost_benchmark.py --size 100GB``

XGBoost Training
----------------
This task uses the XGBoostTrainer module to train on different sizes of data
with different amounts of parallelism.
XGBoost parameters were kept at their defaults for xgboost==1.6.1 in this task.

- `XGBoost Training Script`_
- `XGBoost Cluster Configuration`_

.. list-table ::

    * - **Cluster Setup**
      - **Data Size**
      - **Performance**
      - **Command**
    * - 1 m5.4xlarge node (1 actor)
      - 10 GB (26M rows)
      - 692 s
      - ``python xgboost_benchmark.py --size 10GB``
    * - 10 m5.4xlarge nodes (10 actors)
      - 100 GB (260M rows)
      - 693 s
      - ``python xgboost_benchmark.py --size 100GB``
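
Because the data size and the node count both grow by 10x between the two rows, the training results can be read as a weak-scaling comparison: ideally the wall-clock time stays constant. The sketch below computes that ratio from the table values; the helper name ``weak_scaling_efficiency`` is hypothetical and is not part of the benchmark script.

```python
def weak_scaling_efficiency(base_seconds: float, scaled_seconds: float) -> float:
    """Ratio of baseline time to scaled-out time; 1.0 means perfect weak scaling.

    Hypothetical helper for illustration only. It assumes the work per node is
    the same in both runs (here, 10 GB per node).
    """
    if base_seconds <= 0 or scaled_seconds <= 0:
        raise ValueError("times must be positive")
    return base_seconds / scaled_seconds


# Training times copied from the XGBoost training table.
efficiency = weak_scaling_efficiency(base_seconds=692.0, scaled_seconds=693.0)
print(f"Weak-scaling efficiency: {efficiency:.1%}")
```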
.. _`Bulk Ingest Script`: https://github.com/ray-project/ray/blob/a30bdf9ef34a45f973b589993f7707a763df6ebf/release/air_tests/air_benchmarks/workloads/data_benchmark.py#L25-L40
.. _`Bulk Ingest Cluster Configuration`: https://github.com/ray-project/ray/blob/a30bdf9ef34a45f973b589993f7707a763df6ebf/release/air_tests/air_benchmarks/data_20_nodes.yaml#L6-L15
.. _`XGBoost Training Script`: https://github.com/ray-project/ray/blob/a241e6a0f5a630d6ed5b84cce30c51963834d15b/release/air_tests/air_benchmarks/workloads/xgboost_benchmark.py#L40-L58
.. _`XGBoost Prediction Script`: https://github.com/ray-project/ray/blob/a241e6a0f5a630d6ed5b84cce30c51963834d15b/release/air_tests/air_benchmarks/workloads/xgboost_benchmark.py#L63-L71
.. _`XGBoost Cluster Configuration`: https://github.com/ray-project/ray/blob/a241e6a0f5a630d6ed5b84cce30c51963834d15b/release/air_tests/air_benchmarks/xgboost_compute_tpl.yaml#L6-L24