ray/python/requirements/data_processing
siddgoel 0722cbb37e
Add support for snappy text decompression #22298 (#22486)
Adds a streaming-based reading option for Snappy-compressed files. Arrow doesn't support streaming Snappy decompression because the canonical C++ Snappy library doesn't natively support it. This PR works around that limitation by reading Snappy-compressed files through the streaming decompression API provided by the [python-snappy](https://github.com/andrix/python-snappy) package.

This commit supplies a custom datasource that uses Arrow + [python-snappy](https://github.com/andrix/python-snappy) to read and decompress Snappy-compressed files.
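
The streaming read described above can be sketched roughly as follows. This is a minimal illustration, not the datasource added by this commit: the helper names `iter_decompressed`, `iter_lines`, and `read_snappy_text` are hypothetical, and the sketch assumes the input uses the Snappy framing format that `snappy.StreamDecompressor` from python-snappy expects (raw or Hadoop-style Snappy files would need a different entry point).

```python
from typing import BinaryIO, Iterable, Iterator


def iter_decompressed(
    src: BinaryIO, decompressor, chunk_size: int = 64 * 1024
) -> Iterator[bytes]:
    """Yield decompressed chunks from src without reading the whole file.

    `decompressor` is any object exposing decompress(bytes) and flush(),
    which is the shape of python-snappy's StreamDecompressor.
    """
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = decompressor.decompress(chunk)
        if out:
            yield out
    # Give the decompressor a chance to emit (or validate) any buffered tail.
    tail = decompressor.flush()
    if tail:
        yield tail


def iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Reassemble text lines from arbitrarily split byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            yield line.decode("utf-8")
    if buffer:
        yield buffer.decode("utf-8")


def read_snappy_text(path: str) -> Iterator[str]:
    """Hypothetical helper: stream lines from a framed Snappy text file."""
    import snappy  # python-snappy, imported lazily so the helpers above stay dependency-free

    with open(path, "rb") as f:
        yield from iter_lines(iter_decompressed(f, snappy.StreamDecompressor()))
```

Keeping the decompressor as a parameter means the chunking and line-splitting logic can be exercised with any object that follows the same `decompress`/`flush` protocol, while only `read_snappy_text` depends on python-snappy being installed.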

Co-authored-by: siddharth.goel <siddharth.goel@bytedance.com>
Co-authored-by: Chen Shen <scv119@gmail.com>
2022-03-15 13:52:22 -07:00
requirements.txt [data](deps): Bump dask[complete] (#22334) 2022-02-14 12:44:20 -08:00
requirements_dataset.txt Add support for snappy text decompression #22298 (#22486) 2022-03-15 13:52:22 -07:00