ray/doc/source/data at da9581b7465c5e1e4903595b4107ed1fa601920f - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-10 13:26:39 -04:00

matthewdeng b048c6f659 [data] set iter_batches default batch_size #26869		2022-07-23 13:44:53 -07:00

Why are these changes needed?

Consumers (e.g. Train) may expect generated batches to be of the same size. Prior to this change, the default behavior was for each batch to be one block, and blocks may be of different sizes.

Changes

Set the default batch_size to 256. This was chosen as a sensible default for training workloads and is intentionally different from the existing default batch_size value for Dataset.map_batches.

Update the docs for Dataset.iter_batches, Dataset.map_batches, and DatasetPipeline.iter_batches to be consistent.

Update tests and examples to explicitly pass batch_size=None where they intentionally test block iteration; other tests cover explicit batch sizes.
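The difference between block iteration and fixed-size batching described in the commit message above can be illustrated with a minimal pure-Python sketch. This is not Ray's implementation: the function name `iter_batches_sketch`, the list-of-lists representation of blocks, and the choice to yield a final partial batch are all assumptions made for illustration. Only the default of 256 and the meaning of `batch_size=None` come from the commit message.

```python
from typing import Iterable, Iterator, List, Optional

# Default taken from the commit message; everything else here is hypothetical.
DEFAULT_BATCH_SIZE = 256


def iter_batches_sketch(
    blocks: Iterable[List[int]],
    batch_size: Optional[int] = DEFAULT_BATCH_SIZE,
) -> Iterator[List[int]]:
    """Hypothetical stand-in for batch iteration (not Ray's code)."""
    if batch_size is None:
        # Block iteration: one batch per block, so sizes can differ.
        for block in blocks:
            yield list(block)
        return

    buffer: List[int] = []
    for block in blocks:
        buffer.extend(block)
        # Emit as many full batches as the buffer currently holds.
        while len(buffer) >= batch_size:
            yield buffer[:batch_size]
            buffer = buffer[batch_size:]
    if buffer:
        # Assumption of this sketch: the final partial batch is kept.
        yield buffer


# Three blocks of unequal size, 1000 rows in total.
blocks = [list(range(0, 300)), list(range(300, 400)), list(range(400, 1000))]

block_sizes = [len(b) for b in iter_batches_sketch(blocks, batch_size=None)]
batch_sizes = [len(b) for b in iter_batches_sketch(blocks)]

print(block_sizes)  # [300, 100, 600]
print(batch_sizes)  # [256, 256, 256, 232]
```

With `batch_size=None` the batch sizes follow the block sizes, which is why the tests mentioned in the commit pass it explicitly when they intend to test block iteration.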
doc_code	[data] set iter_batches default batch_size #26869	2022-07-23 13:44:53 -07:00
examples	[core] ray.init defaults to an existing Ray instance if there is one (#26678)	2022-07-23 11:27:22 -07:00
images	[docs] Cleanup the Datasets key concept docs (#26908)	2022-07-22 23:30:54 -07:00
modin	Fix broken links in documentation and put linkcheck linter in place on CI (#23340)	2022-03-18 21:02:52 -07:00
accessing-datasets.rst	[Datasets] Overhaul "Accessing Datasets" feature guide. (#24963)	2022-05-19 12:50:00 -07:00
advanced-pipelines.rst	[data] [docs] Doc audit-- rebalance basic vs advanced materials (#25262)	2022-06-01 13:50:46 -07:00
big_data_ingestion.yaml	Revert "[docs] Clean up doc structure (first part) (#21667)" (#21763)	2022-01-20 15:30:56 -08:00
creating-datasets.rst	[Datasets] Autodetect dataset parallelism based on available resources and data size (#25883)	2022-07-12 21:08:49 -07:00
custom-data.rst	[Datasets] Overhaul of "Creating Datasets" feature guide. (#24831)	2022-05-17 16:23:42 -07:00
dask-on-ray.rst	Update dask version for Ray 1.12.0 (#23197)	2022-03-15 19:22:19 -07:00
dataset-ml-preprocessing.rst	[Datasets] Update docs for drop_columns and fix typos (#26317)	2022-07-07 17:17:33 -07:00
dataset-tensor-support.rst	[Datasets] Unrevert "[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (#25031)" (#25531)	2022-06-08 10:33:25 -07:00
dataset.rst	[docs] Cleanup the Datasets key concept docs (#26908)	2022-07-22 23:30:54 -07:00
faq.rst	docs: Fix a few typos (#26556)	2022-07-14 12:38:33 -07:00
getting-started.rst	[Datasets] [Tensor Story - 2/2] Add `"numpy"` batch format for batch mapping and batch consumption. (#24870)	2022-06-17 16:01:02 -07:00
integrations.rst	Revamp the Getting Started page for Dataset (#24860)	2022-05-18 13:46:23 -07:00
key-concepts.rst	[docs] Cleanup the Datasets key concept docs (#26908)	2022-07-22 23:30:54 -07:00
mars-on-ray.rst	[Datasets] Integrate Mars-on-Ray with Datasets; improve docs and add tests (#23402)	2022-04-29 09:43:52 -07:00
memory-management.rst	[data] [docs] Doc audit-- rebalance basic vs advanced materials (#25262)	2022-06-01 13:50:46 -07:00
package-ref.rst	[Datasets] Add `ImageFolderDatasource` (#24641)	2022-07-15 22:43:23 -07:00
performance-tips.rst	[docs] Cleanup the Datasets key concept docs (#26908)	2022-07-22 23:30:54 -07:00
pipelining-compute.rst	[data] [docs] Doc audit-- rebalance basic vs advanced materials (#25262)	2022-06-01 13:50:46 -07:00
random-access.rst	[Datasets] Overhaul "Accessing Datasets" feature guide. (#24963)	2022-05-19 12:50:00 -07:00
raydp.rst	[Docs] Ray Data docs target state (#21931)	2022-01-27 13:14:36 -08:00
saving-datasets.rst	Revamp the Transforming Datasets user guide (#25033)	2022-05-20 19:25:06 -07:00
transforming-datasets.rst	[Datasets] [Tensor Story - 2/2] Add `"numpy"` batch format for batch mapping and batch consumption. (#24870)	2022-06-17 16:01:02 -07:00
user-guide.rst	[data] [docs] Doc audit-- rebalance basic vs advanced materials (#25262)	2022-06-01 13:50:46 -07:00