mirror of
https://github.com/vale981/ray
synced 2025-03-07 02:51:39 -05:00
Buffering writes to AWS S3 is highly recommended to maximize throughput. Reducing the number of remote I/O requests can make spilling to remote storage as effective as spilling locally. In a test that created and spilled 512GB of objects to an S3 bucket, varying only the buffer size produced the following runtimes:

Buffer Size | Runtime (s)
-- | --
Default | 3221.865916
256KB | 1758.885839
1MB | 748.226089
10MB | 526.406466
100MB | 494.830513

Based on these results, a default buffer size of 1MB has been added. This is the minimum buffer size used by AWS Kinesis Firehose, a streaming delivery service for S3. On systems with more memory available, it is worth configuring a larger buffer size.

For workloads that reach the throughput limits imposed by S3, that bottleneck can be removed by spilling across more prefixes/buckets. The impact is less noticeable here because the performance gains from a large buffer already keep us from reaching the bottleneck. Spilling 512GB with a 1MB buffer and a varying number of prefixes produced the following runtimes:

Prefixes | Runtime (s)
-- | --
1 | 748.226089
3 | 527.658646
10 | 516.010742

Together these changes enable faster large-scale object spilling.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-54-240.us-west-2.compute.internal>
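As a rough sketch, the buffer size and prefix count discussed above could be configured through Ray's external-storage spilling config. The `smart_open` type and the `buffer_size`/`uri` parameter names follow Ray's documented object-spilling config format, but the bucket name and prefixes here are illustrative assumptions, not values from this change:

```python
import json

# Sketch of an S3 object-spilling config with a 1MB write buffer.
# Bucket and prefix names are placeholders; adjust for your environment.
spilling_config = json.dumps({
    "type": "smart_open",
    "params": {
        # Multiple prefixes spread requests across S3 partitions,
        # avoiding per-prefix throughput limits.
        "uri": [
            "s3://example-bucket/spill-1",
            "s3://example-bucket/spill-2",
            "s3://example-bucket/spill-3",
        ],
        "buffer_size": 1 * 1024 * 1024,  # 1MB, the new default
    },
})

# This would be passed to Ray at startup, e.g.:
# ray.init(_system_config={"object_spilling_config": spilling_config})
print(spilling_config)
```

With the 1MB default in place, raising `buffer_size` further mainly helps on hosts with spare memory, per the runtimes in the tables above.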