ray/doc/source
Kenneth 9b67cb5a6f
Add buffering to object spilling (#22618)
This change is needed for object fusing to improve performance on HDDs. Currently, small object writes are slow even with fusing because the writes are not buffered, which negates the point of fusing. Benchmarks show that while the default buffer size is sufficient for fast SSDs, increasing the buffer size on a slow HDD reduces write times by several orders of magnitude.
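
To illustrate why buffering matters for fused writes, here is a minimal Python sketch. It is not the code from this change: `write_fused`, the payload sizes, and the 1 MiB buffer are all hypothetical. It only shows that many small writes to one file can be coalesced in a user-space buffer before they reach the disk.

```python
import os
import tempfile


def write_fused(path, payloads, buffer_size):
    """Write many small payloads to one file through a user-space buffer.

    Hypothetical helper for illustration only. Returns the byte offset at
    which each payload starts inside the fused file.
    """
    offsets = []
    # In binary mode, a `buffering` value greater than 1 makes open() use a
    # fixed-size buffer of that many bytes, so small writes are collected in
    # memory and flushed to the file in larger chunks.
    with open(path, "wb", buffering=buffer_size) as f:
        for payload in payloads:
            offsets.append(f.tell())  # logical position, includes buffered bytes
            f.write(payload)
    return offsets


# Three small stand-in "objects" fused into a single file.
payloads = [b"a" * 500, b"b" * 300, b"c" * 200]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fused.bin")
    offsets = write_fused(path, payloads, buffer_size=1024 * 1024)
    size = os.path.getsize(path)

print(offsets, size)
```

With a buffer larger than the combined payloads, the data is handed to the operating system in one flush when the file is closed rather than once per object.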

### Performance Changes
Results from a microbenchmark that produces 500 KB objects (which are then spilled) and consumes them, run to observe changes in object fusing and spilling.

| Run | Produce (s) | Consume (s) | Total (s) |
| -- | -- | -- | -- |
| Baseline (original) | 347.332281 | 355.611272 | 705.560750 |
| Baseline (w/ fix) | 181.815852 | 347.692850 | 532.847759 |
| No fusing (original) | 453.574554 | 525.047998 | 981.620108 |
| No fusing (w/ fix) | 452.614848 | 519.787698 | 975.412639 |

The baseline runs (fusing enabled) should be notably faster than the no-fusing runs because object fusing reduces the number of I/O requests. With the fix, Ray's defaults give this microbenchmark a roughly 48% reduction in produce time (about 24% in total time), with negligible impact on runtime when fusing is disabled.
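
The reductions can be recomputed from the table above. This is a small sketch; the helper name `reduction_pct` is introduced here for illustration, and the inputs are the timings from the table.

```python
def reduction_pct(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Timings in seconds, copied from the table above.
baseline_produce = reduction_pct(347.332281, 181.815852)
baseline_total = reduction_pct(705.560750, 532.847759)
no_fusing_total = reduction_pct(981.620108, 975.412639)

print(f"baseline produce: {baseline_produce:.1f}%")
print(f"baseline total:   {baseline_total:.1f}%")
print(f"no-fusing total:  {no_fusing_total:.1f}%")
```

The gain is concentrated in the produce phase, which is where spilled objects are written. Consume time is nearly unchanged.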

See [this followup](https://github.com/ray-project/ray/pull/22618#issuecomment-1054838715) for information on the differences between SSD and HDD performance with different buffer sizes.
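
For HDD-backed spill directories, a larger buffer might be configured along these lines. This is a sketch under assumptions: the `buffer_size` parameter name, its unit (bytes), the `/tmp/spill` path, and the 1 MB value are not stated in this commit message and should be checked against the object spilling documentation under `ray-core`.

```python
import json

# Assumed shape of the filesystem spilling configuration.
spill_config = {
    "type": "filesystem",
    "params": {
        "directory_path": "/tmp/spill",  # hypothetical spill directory
        "buffer_size": 1_000_000,  # assumed parameter name, in bytes
    },
}

# The spilling configuration is passed as a JSON string.
system_config = {"object_spilling_config": json.dumps(spill_config)}

# Assumed usage, left commented out so the sketch runs without Ray installed:
# import ray
# ray.init(_system_config=system_config)
print(system_config["object_spilling_config"])
```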

Co-authored-by: Ubuntu <ubuntu@ip-172-31-54-240.us-west-2.compute.internal>
2022-03-01 10:13:10 -08:00
..
_includes [docs] RLlib concepts consolidation, user guide, RL conf prep (#22496) 2022-02-18 09:35:20 -08:00
_static [docs] RLlib concepts consolidation, user guide, RL conf prep (#22496) 2022-02-18 09:35:20 -08:00
_templates [Docs] Trial Fathom analytics for doc pages (#17056) 2021-07-14 14:11:52 -07:00
cluster [KubeRay] Provide a new Dockerfile for fast build (#22689) 2022-02-28 17:09:16 -08:00
data Revert "Support creating a DatasetPipeline windowed by bytes (#22577)" (#22695) 2022-02-28 11:56:12 -08:00
images [docs] Tune overhaul part II (#22656) 2022-02-26 23:07:34 -08:00
ray-contribute [GCS-Ray] update doc and error message for GCS-Ray (#22528) 2022-02-22 17:56:30 -08:00
ray-core Add buffering to object spilling (#22618) 2022-03-01 10:13:10 -08:00
ray-design-patterns [docs] Make design pattern example self contained (#20981) 2021-12-09 20:19:38 -08:00
ray-more-libs [docs] new structure (#21776) 2022-01-21 15:42:05 -08:00
ray-observability [GCS-Ray] update doc and error message for GCS-Ray (#22528) 2022-02-22 17:56:30 -08:00
ray-overview [docs] sphinx gallery removal, migrate to ipynb (#22467) 2022-02-19 01:19:07 -08:00
ray-references [Doc] [Jobs] add CLI and SDK reference to docs (#22680) 2022-02-28 17:57:46 -06:00
raysgd [GCS-Ray] update doc and error message for GCS-Ray (#22528) 2022-02-22 17:56:30 -08:00
rllib [RLlib] SlateQ: framework=tf fixes and SlateQ documentation update (#22543) 2022-02-23 13:03:45 +01:00
serve [serve] Support user-provided health check via def check_health(self) method (#22178) 2022-02-11 12:53:37 -06:00
train [Train] Update docs for ray.train.torch import (#22555) 2022-02-23 19:22:27 -08:00
tune [docs] Tune overhaul part II (#22656) 2022-02-26 23:07:34 -08:00
workflows [docs] landing page (fixes #21750) (#21859) 2022-01-26 17:14:25 -08:00
_toc.yml [docs] Tune overhaul part II (#22656) 2022-02-26 23:07:34 -08:00
conf.py [docs] sphinx gallery removal, migrate to ipynb (#22467) 2022-02-19 01:19:07 -08:00
custom_directives.py [docs] sphinx gallery removal, migrate to ipynb (#22467) 2022-02-19 01:19:07 -08:00
index.md [Docs] Ray Data docs target state (#21931) 2022-01-27 13:14:36 -08:00