ray/doc/source/ray-air/doc_code
Eric Liang 1f509ab331
[air] Add DatasetParallelTrainer.dataset_config for configuring dataset ingest (#25337)
This adds a per-dataset config object to DataParallelTrainer. Each config defines how a Dataset is read into the trainer: its preprocessing, splitting, and ingest strategy. DataParallelTrainers declare a default DatasetConfig for each dataset passed in the ``datasets`` argument, and users can selectively override these defaults by passing the ``dataset_config`` argument. Trainers can also define which values are user-customizable (e.g., XGBoostTrainer doesn't support streaming ingest).
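
The default-plus-override behavior described above can be illustrated with a minimal, self-contained sketch. It is not Ray's implementation: ``SketchDatasetConfig``, its field names, and ``merge_configs`` are hypothetical stand-ins, and the rule that unset (``None``) fields fall back to the trainer default is an assumption made for illustration.

```python
from dataclasses import dataclass, fields, replace
from typing import Dict, Optional


@dataclass
class SketchDatasetConfig:
    """Hypothetical stand-in for a per-dataset config (not Ray's class)."""

    fit: Optional[bool] = None        # fit the preprocessor on this dataset?
    split: Optional[bool] = None      # split the dataset across workers?
    streaming: Optional[bool] = None  # ingest strategy: streaming vs. bulk


def merge_configs(
    defaults: Dict[str, SketchDatasetConfig],
    overrides: Dict[str, SketchDatasetConfig],
) -> Dict[str, SketchDatasetConfig]:
    """Return the defaults with user overrides applied field by field.

    Assumption: a field left as None in an override keeps the default value.
    """
    merged = dict(defaults)
    for name, override in overrides.items():
        base = merged.get(name, SketchDatasetConfig())
        # Collect only the fields the user explicitly set.
        updates = {
            f.name: getattr(override, f.name)
            for f in fields(override)
            if getattr(override, f.name) is not None
        }
        # replace() builds a new object, so the defaults stay untouched.
        merged[name] = replace(base, **updates)
    return merged


# Defaults a trainer might declare for the datasets it is given.
defaults = {
    "train": SketchDatasetConfig(fit=True, split=True, streaming=False),
    "valid": SketchDatasetConfig(fit=False, split=False, streaming=False),
}

# The user overrides only the splitting behavior of the "train" dataset.
merged = merge_configs(defaults, {"train": SketchDatasetConfig(split=False)})
```

In this sketch the ``"train"`` entry keeps its default ``fit`` and ``streaming`` values and only ``split`` changes, while ``"valid"`` is left as declared. A trainer that cannot support an option (such as streaming ingest) could reject the override in ``merge_configs`` instead of applying it.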

This PR adds the minimal support for dataset configs. Future PRs will:
- Add support for streaming ingest
- Move this config from DataParallelTrainer to ml.Trainer
2022-06-03 16:32:53 -07:00
air_ingest.py [air] Add DatasetParallelTrainer.dataset_config for configuring dataset ingest (#25337) 2022-06-03 16:32:53 -07:00
air_key_concepts.py [air] Move python/ray/ml to python/ray/air (#25449) 2022-06-03 21:53:44 +01:00
output.txt [docs] Add AIR data ingest docs (part 1-- bulk loading only) (#24799) 2022-05-19 14:25:47 -07:00
preprocessors.py [air] Move python/ray/ml to python/ray/air (#25449) 2022-06-03 21:53:44 +01:00
pytorch_starter.py [air] Move python/ray/ml to python/ray/air (#25449) 2022-06-03 21:53:44 +01:00
tf_starter.py [air] Move python/ray/ml to python/ray/air (#25449) 2022-06-03 21:53:44 +01:00
xgboost_starter.py [air] Move python/ray/ml to python/ray/air (#25449) 2022-06-03 21:53:44 +01:00