ray/doc/source/data
Clark Zinzow 841f7c81ff
[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (#24812)
This PR makes several improvements to the Datasets' tensor story. See the issues for each item for more details.

- Automatically infer tensor blocks (single-column tables representing a single tensor) when returning NumPy ndarrays from map_batches(), map(), and flat_map().
- Automatically infer tensor columns when building tabular blocks in general.
- Fix shuffling and sorting for tensor columns.

This should improve the UX/efficiency of the following:

- Working with pure-tensor datasets in general.
- Mapping tensor UDFs over pure-tensor datasets, which gives a better foundation for tensor-native preprocessing for end-users and AIR.
2022-05-19 22:40:04 -07:00
doc_code Revamp the Saving Datasets user guide (#24987) 2022-05-19 15:40:12 -07:00
examples [Datasets] Add basic e2e Datasets example on NYC taxi dataset (#24874) 2022-05-19 12:54:25 -07:00
images [minor] Fix incorrect link to ray core user guide (#23316) 2022-03-17 20:58:56 -07:00
modin Fix broken links in documentation and put linkcheck linter in place on CI (#23340) 2022-03-18 21:02:52 -07:00
accessing-datasets.rst [Datasets] Overhaul "Accessing Datasets" feature guide. (#24963) 2022-05-19 12:50:00 -07:00
advanced-pipelines.rst [Datasets] Miscellaneous GA docs P0s. (#24891) 2022-05-18 16:17:48 -07:00
big_data_ingestion.yaml Revert "[docs] Clean up doc structure (first part) (#21667)" (#21763) 2022-01-20 15:30:56 -08:00
creating-datasets.rst [Datasets] Miscellaneous GA docs P0s. (#24891) 2022-05-18 16:17:48 -07:00
custom-data.rst [Datasets] Overhaul of "Creating Datasets" feature guide. (#24831) 2022-05-17 16:23:42 -07:00
dask-on-ray.rst Update dask version for Ray 1.12.0 (#23197) 2022-03-15 19:22:19 -07:00
dataset-ml-preprocessing.rst [Dataset GA doc] Decompose the monolith of Getting Started page (and get them under User Guide) (#23311) 2022-03-18 11:25:43 -07:00
dataset-tensor-support.rst [Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (#24812) 2022-05-19 22:40:04 -07:00
dataset.rst [Datasets] Add FAQ to Datasets docs. (#24932) 2022-05-19 15:44:22 -07:00
faq.rst [Datasets] Add FAQ to Datasets docs. (#24932) 2022-05-19 15:44:22 -07:00
getting-started.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00
integrations.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00
key-concepts.rst [Datasets] Miscellaneous GA docs P0s. (#24891) 2022-05-18 16:17:48 -07:00
mars-on-ray.rst [Datasets] Integrate Mars-on-Ray with Datasets; improve docs and add tests (#23402) 2022-04-29 09:43:52 -07:00
package-ref.rst [Datasets] Change range_arrow() API to range_table() (#24704) 2022-05-17 01:09:45 -07:00
performance-tips.rst [doc] Add docs for push-based shuffle in Datasets (#24486) 2022-05-05 14:59:33 -07:00
pipelining-compute.rst Remove dataset pipeline from the Getting Started page (#23756) 2022-04-07 12:52:04 -07:00
random-access.rst [Datasets] Overhaul "Accessing Datasets" feature guide. (#24963) 2022-05-19 12:50:00 -07:00
raydp.rst [Docs] Ray Data docs target state (#21931) 2022-01-27 13:14:36 -08:00
saving-datasets.rst Revamp the Saving Datasets user guide (#24987) 2022-05-19 15:40:12 -07:00
transforming-datasets.rst [Datasets] Change range_arrow() API to range_table() (#24704) 2022-05-17 01:09:45 -07:00
user-guide.rst Revamp the Getting Started page for Dataset (#24860) 2022-05-18 13:46:23 -07:00