ray/doc/source/data at e9068c45faf2928cd8e38f6d38c5260cb06130d2 - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 19:41:38 -05:00

Eric Liang e9068c45fa [data] Instrument most remaining dataset functions and add docs (#21412)		2022-01-06 17:08:56 -08:00

This PR finishes most of the stats todos for dataset. The main thing punted for future work is instrumentation of split(), which is particularly tricky since only certain blocks are transformed.

Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>

_examples	[Train][Data] Change usages of `iter_datasets` to `iter_epochs` (#20487)	2021-11-17 18:05:51 -08:00
modin	[client][docs] update docs for new client support in init (#17333)	2021-08-04 05:31:44 +03:00
.gitignore	[Core][Dataset] adding example for large scale data ingestion (#18998)	2021-10-11 15:37:09 -07:00
big_data_ingestion.yaml	[Core][Dataset] adding example for large scale data ingestion (#18998)	2021-10-11 15:37:09 -07:00
dask-on-ray.rst	[Dask-on-Ray] Add Dask config helper, set task-based shuffle by default. (#21114)	2021-12-17 13:16:37 -08:00
dataset-arch.svg	[data] Cleanup Block type by dropping Generic[T] (#17276)	2021-07-23 09:23:06 -07:00
dataset-compute-1.png	Dataset doc updates (#19815)	2021-11-04 18:13:40 -07:00
dataset-execution-model.rst	[data] Instrument most remaining dataset functions and add docs (#21412)	2022-01-06 17:08:56 -08:00
dataset-loading-1.png	Dataset doc updates (#19815)	2021-11-04 18:13:40 -07:00
dataset-loading-2.png	Dataset doc updates (#19815)	2021-11-04 18:13:40 -07:00
dataset-map.svg	Split blocks automatically into 500MB chunks on file read and transformation (#20235)	2021-11-15 22:25:11 -08:00
dataset-ml-preprocessing.rst	[data] Instrument most remaining dataset functions and add docs (#21412)	2022-01-06 17:08:56 -08:00
dataset-pipeline-1.svg	Initial implementation of Dataset pipelining and docs (#17309)	2021-07-28 21:12:01 -07:00
dataset-pipeline-2.svg	Initial implementation of Dataset pipelining and docs (#17309)	2021-07-28 21:12:01 -07:00
dataset-pipeline-3.svg	Initial implementation of Dataset pipelining and docs (#17309)	2021-07-28 21:12:01 -07:00
dataset-pipeline.rst	[Train] Rename Ray SGD v2 to Ray Train (#19436)	2021-10-18 22:27:46 -07:00
dataset-read.svg	Split blocks automatically into 500MB chunks on file read and transformation (#20235)	2021-11-15 22:25:11 -08:00
dataset-repeat-1.svg	Initial implementation of Dataset pipelining and docs (#17309)	2021-07-28 21:12:01 -07:00
dataset-repeat-2.svg	Initial implementation of Dataset pipelining and docs (#17309)	2021-07-28 21:12:01 -07:00
dataset-shuffle.svg	Split blocks automatically into 500MB chunks on file read and transformation (#20235)	2021-11-15 22:25:11 -08:00
dataset-spill.svg	Split blocks automatically into 500MB chunks on file read and transformation (#20235)	2021-11-15 22:25:11 -08:00
dataset-tensor-support.rst	[Datasets] Delineate between ref and raw APIs for the Pandas/Arrow integrations. (#18992)	2021-10-01 13:08:25 -07:00
dataset.rst	[Datasets] Last-mile preprocessing docs. (#20712)	2021-11-29 23:23:27 -08:00
dataset.svg	[data] Cleanup Block type by dropping Generic[T] (#17276)	2021-07-23 09:23:06 -07:00
mars-on-ray.rst	First cut at dataset documentation (#16956)	2021-07-14 23:27:13 -07:00
package-ref.rst	Simple block dataset groupBy (#19435)	2021-10-19 19:53:13 -07:00
raydp.rst	[Train] Rename Ray SGD v2 to Ray Train (#19436)	2021-10-18 22:27:46 -07:00