ray/doc/source/data
Jian Xiao 9fe4dba4ad
Revamp the Getting Started page for Dataset (#24860)
This is part of the Dataset GA doc fix effort to update/improve the documentation.
This PR revamps the Getting Started page.

Changes:
- Focus on the basic/core features that are bread-and-butter for users; leave the advanced features out
- Focus on a high-level introduction; leave the detailed spec out (e.g. the possible batch_types for the map_batches() API)
- Use a more realistic (yet still simple) data example that's familiar to people (the Iris dataset in this case)
- Use the same data example throughout so readers don't have to switch context
- Use runnable code rather than fake code
- Reference the code from the doc instead of inlining it in the doc

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-136.us-west-2.compute.internal>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2022-05-18 13:46:23 -07:00
..
doc_code Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00
examples [Doc][Data] fix big-data-ingestion broken links (#24631) 2022-05-10 09:04:41 -07:00
images [minor] Fix incorrect link to ray core user guide (#23316) 2022-03-17 20:58:56 -07:00
modin Fix broken links in documentation and put linkcheck linter in place on CI (#23340) 2022-03-18 21:02:52 -07:00
accessing-datasets.rst Cleanup the DatasetPipeline references in Getting Started; rename Exchanging to Accessing (#23786) 2022-04-12 17:10:14 -07:00
advanced-pipelines.rst [minor] Fix incorrect link to ray core user guide (#23316) 2022-03-17 20:58:56 -07:00
big_data_ingestion.yaml Revert "[docs] Clean up doc structure (first part) (#21667)" (#21763) 2022-01-20 15:30:56 -08:00
creating-datasets.rst [Datasets] Overhaul of "Creating Datasets" feature guide. (#24831) 2022-05-17 16:23:42 -07:00
custom-data.rst [Datasets] Overhaul of "Creating Datasets" feature guide. (#24831) 2022-05-17 16:23:42 -07:00
dask-on-ray.rst Update dask version for Ray 1.12.0 (#23197) 2022-03-15 19:22:19 -07:00
dataset-ml-preprocessing.rst [Dataset GA doc] Decompose the monolith of Getting Started page (and get them under User Guide) (#23311) 2022-03-18 11:25:43 -07:00
dataset-tensor-support.rst [Datasets] Support tensor columns in to_tf and to_torch. (#24752) 2022-05-17 01:11:00 -07:00
dataset.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00
getting-started.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00
integrations.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00
key-concepts.rst Cleanup the DatasetPipeline references in Getting Started; rename Exchanging to Accessing (#23786) 2022-04-12 17:10:14 -07:00
mars-on-ray.rst [Datasets] Integrate Mars-on-Ray with Datasets; improve docs and add tests (#23402) 2022-04-29 09:43:52 -07:00
package-ref.rst [Datasets] Change range_arrow() API to range_table() (#24704) 2022-05-17 01:09:45 -07:00
performance-tips.rst [doc] Add docs for push-based shuffle in Datasets (#24486) 2022-05-05 14:59:33 -07:00
pipelining-compute.rst Remove dataset pipeline from the Getting Started page (#23756) 2022-04-07 12:52:04 -07:00
random-access.rst [Datasets] Change range_arrow() API to range_table() (#24704) 2022-05-17 01:09:45 -07:00
raydp.rst [Docs] Ray Data docs target state (#21931) 2022-01-27 13:14:36 -08:00
saving-datasets.rst [Dataset GA doc] Decompose the monolith of Getting Started page (and get them under User Guide) (#23311) 2022-03-18 11:25:43 -07:00
transforming-datasets.rst [Datasets] Change range_arrow() API to range_table() (#24704) 2022-05-17 01:09:45 -07:00
user-guide.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00