ray/doc/source/data/examples/index.rst
Philipp Moritz 1ba8c8cc67
[Examples] OCR Ray Datasets example (#25930)
This is a simple example that shows how to do OCR with Ray Datasets. It includes:

- How to upload and download the dataset to and from S3
- How to run OCR on the dataset with tesseract
- How to use actors to keep around and re-use a spaCy context for doing NLP on the data

Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>
2022-07-06 13:11:26 -07:00


.. _datasets-examples-ref:

========
Examples
========

.. tip:: Check out the Datasets :ref:`User Guide <data_user_guide>` to learn more about
    Datasets' features in depth.

.. _datasets-recipes:

Simple Data Processing Examples
-------------------------------

Ray Datasets is a data processing engine that supports multiple data
modalities and types. The end-to-end examples below show basic data
processing with Ray Datasets on tabular data, text (coming soon!), and imagery (coming
soon!).


.. panels::
    :container: container pb-4
    :column: col-md-4 px-2 py-2
    :img-top-cls: pt-5 w-75 d-block mx-auto

    ---
    :img-top: /images/taxi.png

    +++
    .. link-button:: nyc_taxi_basic_processing
        :type: ref
        :text: Processing NYC taxi data using Ray Datasets
        :classes: btn-link btn-block stretched-link
    ---
    :img-top: /images/ocr.jpg

    +++
    .. link-button:: ocr_example
        :type: ref
        :text: Optical character recognition using Ray Datasets
        :classes: btn-link btn-block stretched-link


Scaling Out Datasets Workloads
------------------------------

These examples demonstrate using Ray Datasets on large-scale data over a multi-node Ray
cluster.


.. panels::
    :container: container pb-4
    :column: col-md-4 px-2 py-2
    :img-top-cls: pt-5 w-75 d-block mx-auto

    ---
    :img-top: /images/dataset-repeat-2.svg

    +++
    .. link-button:: big_data_ingestion
        :type: ref
        :text: Large-scale ML Ingest
        :classes: btn-link btn-block stretched-link