113 commits

Author SHA1 Message Date
Eric Liang
5f18c67ba3
Fix LINT (#26554)
Signed-off-by: Eric Liang <ekhliang@gmail.com>
2022-07-13 23:28:02 -07:00
Jiao
15dbc0362a
[AIR][Docs] Fix torch_image_example (#26453) 2022-07-13 21:59:24 -07:00
Eric Liang
31c8c908f9
[docs] Improve AIR API ref organization (#26530) 2022-07-13 18:05:17 -07:00
Antoni Baum
5ed10ef921
[AIR/CI] Fix Hugging Face notebook example (#26475) 2022-07-13 09:16:42 -07:00
Eric Liang
4c04c8d92c
[doc] Rename toc entry for libraries back to "Ray Libraries" (#26485) 2022-07-12 14:23:36 -07:00
Richard Liaw
92efc85b3b
[air/docs] checkpoints (#25901) 2022-07-11 20:40:23 -07:00
Richard Liaw
1abe908c22
[air/docs] improve consistency of getting started (#26247) 2022-07-11 20:16:37 -07:00
Antoni Baum
65ea710e30
[Docs] Update Train user guide to use the new APIs (#26091) 2022-07-11 15:10:10 -07:00
Richard Liaw
5892a76a44
[air/tune] Documentation testing fixes (#26409) 2022-07-09 19:47:21 -07:00
Amog Kamsetty
cc43bcccb4
[AIR] Update TensorflowPredictor to new API (#26215)
Updates TensorflowPredictor to use the new _predict_pandas API.

Also, as agreed offline, removes the extra configurations from TensorflowPredictor (column selection, concatenation) in favor of doing this via a Preprocessor.
2022-07-08 13:04:49 -07:00
Antoni Baum
ea94cda1f3
[AIR] Replace train. with session. (#26303)
This PR replaces legacy API calls to `train.` with AIR `session.` in Train code, examples and docs.

Depends on https://github.com/ray-project/ray/pull/25735
2022-07-07 16:29:04 -07:00
Antoni Baum
d1966899bb
[Docs] Small fix to AIR examples descriptions (#26227) 2022-07-05 17:16:56 -07:00
Simon Mo
88a219c7f2
Revert "Revert "[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment"" (#26231) 2022-07-05 13:26:49 -07:00
xwjiang2010
ac831fded4
[air] update documentation to use session.report (#26051)
Update documentation to use `session.report`.

Next steps:
1. Update our internal caller to use `session.report`. Most importantly, CheckpointManager and DataParallelTrainer.
2. Update `get_trial_resources` to use PGF notions to incorporate the requirement of ResourceChangingScheduler. @Yard1
3. After 2 is done, change all `tune.get_trial_resources` to `session.get_trial_resources`
4. [internal implementation] remove special checkpoint handling logic from huggingface trainer. Optimize the flow for checkpoint conversion with `session.report`.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-30 10:37:31 -07:00
matthewdeng
4a21dc31ae
[air] update DummyTrainer to handle DatasetPipelines (#26175)
1. Update `DummyTrainer` to take `num_epochs` instead of `runtime_seconds`.
    1. Ray Train expects every worker to make the same number of calls to `train.report()`. With a time limit, workers may run at different speeds and terminate after different epoch numbers, which causes an error.
2. Add `generate_epochs` to support `DatasetPipeline` when `use_stream_api` is True.
3. Update `__main__` code to support testing different configurations.
2022-06-29 09:32:57 -07:00
Antoni Baum
dc7ed086a5
[AIR] More checkpoint configurability, Result extension (#25943)
This PR:
* Allows the user to set `keep_checkpoints_num` and `checkpoint_score_attr` in `RunConfig` using the `CheckpointStrategy` dataclass
* Adds two new fields to the `Result` object - `best_checkpoints` - a list of saved best checkpoints as determined by `CheckpointingConfig`.
2022-06-29 08:23:29 -07:00
Antoni Baum
128f9e5664
[AIR] Move integration logging callbacks to AIR (#26126)
As the integration logging callbacks are commonly used with AIR Trainers, they should be moved from the tune package to the air package. The old imports will still work, but raise a deprecation warning.
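
A minimal sketch of how an old import location can keep working while raising a deprecation warning. Both class names and the `project` parameter are hypothetical; this illustrates the general pattern, not the code in this PR.

```python
import warnings


class NewLoggerCallback:
    """Hypothetical callback living at the new import location."""

    def __init__(self, project: str):
        self.project = project


class OldLoggerCallback(NewLoggerCallback):
    """Hypothetical alias kept at the old import location."""

    def __init__(self, *args, **kwargs):
        # Warn on construction, then defer to the relocated class.
        warnings.warn(
            "OldLoggerCallback has moved; import NewLoggerCallback instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


# Record warnings so the deprecation can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    callback = OldLoggerCallback(project="demo")

print(callback.project, [w.category.__name__ for w in caught])
```

Subclassing the relocated class keeps `isinstance` checks working for code that still uses the old name.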
2022-06-28 17:25:19 -07:00
Stephanie Wang
c9be251b7a
Revert "[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment (#25962)" (#26176)
This reverts commit 68692b3464.
2022-06-28 17:07:07 -07:00
Simon Mo
68692b3464
[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment (#25962) 2022-06-28 10:26:10 -07:00
Antoni Baum
0ec198acc2
[AIR] Remove unnecessary pandas from examples (#26009)
Removes unnecessary pandas usage from AIR examples. Helps ensure users do not follow bad practices.
2022-06-24 14:38:23 -07:00
Antoni Baum
94492c2b49
[AIR/Docs] Improve user guide gallery (#26016)
Moves logging to examples, reorders to match TOC, adds missing entry.
2022-06-23 17:51:01 -07:00
Antoni Baum
91dd360f9d
[AIR/train] Move predictors to ray.train (#25769) 2022-06-15 17:02:15 -07:00
xwjiang2010
88d824d067
[air] remove fully_executed from Tune. (#25750) 2022-06-14 22:32:48 -07:00
Kai Fricke
fdf85ea403
[air] Add tutorial to convert existing pytorch code to Ray AIR (#25723) 2022-06-14 18:11:32 -07:00
Eric Liang
ff2cfbe351
[air] Add streaming BatchPredictor support (#25693) 2022-06-13 15:22:36 -07:00
Antoni Baum
5e9a8eb5f6
[AIR/data] Move preprocessors to ray.data (#25599)
Moves ray.air.Preprocessor and ray.air.preprocessors to ray.data to converge on the agreed-upon package structure discussed internally.
2022-06-13 12:57:59 -07:00
matthewdeng
88524d8b57
[air] add CustomStatefulPreprocessor (#25497) 2022-06-09 16:54:46 -07:00
Eric Liang
a058a98c5d
[docs] Try to clarify some advantages of bulk ingest in the AIR ingest docs (#25616) 2022-06-09 11:47:22 -07:00
Amog Kamsetty
1316a2d05e
[AIR/Train] Move ray.air.train to ray.train (#25570) 2022-06-08 21:34:18 -07:00
Antoni Baum
7616435ed0
[Docs] Capitalize Ray AIR (#25597) 2022-06-08 14:37:53 -07:00
xwjiang2010
29a063afdf
[air] add feast example (#25417) 2022-06-07 14:55:42 -07:00
xwjiang2010
76b34d4a03
[air] add to_air_checkpoint method for inference only workload. (#25444)
Follow-up to our last discussion on supporting users who adopt AIR piecemeal.
Implemented only for TensorFlow for now; I want to collect feedback on API naming, package structure, etc., before adding the others.
2022-06-07 14:50:39 -07:00
Simon Mo
7471b1fa41
[Serve] [AIR] ModelWrapper improvements and docs (#25003)
* batching collation code and tests

* wip notebook for np and dataframe

* finish content

* reset ray-more-libs changes

* add comments

* run through

* Apply suggestions from code review

* rename package

* lint

* richard's comment

Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
2022-06-07 08:53:10 -07:00
Eric Liang
c1afbcb6f4
[air] Enforce API stability annotations for AIR module (#25485) 2022-06-06 22:52:21 -07:00
Eric Liang
78688a0903
Enable streaming ingest in AIR (#25428)
This adds the following options to DatasetConfig, which can be used to enable streaming ingest.

```
    # Whether the dataset should be streamed into memory using pipelined reads.
    # When enabled, get_dataset_shard() returns DatasetPipeline instead of Dataset.
    # The amount of memory to use is controlled by `stream_window_size`.
    # False by default for all datasets.
    use_stream_api: Optional[bool] = None

    # Configure the streaming window size in bytes. A typical value is something like
    # 20% of object store memory. If set to -1, then an infinite window size will be
    # used (similar to bulk ingest). This only has an effect if use_stream_api is set.
    # Set to 1.0 GiB by default.
    stream_window_size: Optional[float] = None

    # Whether to enable global shuffle (per pipeline window in streaming mode). Note
    # that this is an expensive all-to-all operation, and most likely you want to use
    # local shuffle instead.
    # False by default for all datasets.
    global_shuffle: Optional[bool] = None
```
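
A minimal sketch of how the `None` values above could resolve to the defaults stated in the comments. `StreamIngestOptions`, `DEFAULTS`, and `fill_defaults` are hypothetical stand-ins that only mirror the three fields; they are not Ray's `DatasetConfig` or its resolution logic.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional

GIB = 1024 ** 3


@dataclass
class StreamIngestOptions:
    """Hypothetical stand-in mirroring the three fields listed above."""

    use_stream_api: Optional[bool] = None
    stream_window_size: Optional[float] = None
    global_shuffle: Optional[bool] = None


# Defaults taken from the comments in the listing above.
DEFAULTS = StreamIngestOptions(
    use_stream_api=False,
    stream_window_size=1.0 * GIB,
    global_shuffle=False,
)


def fill_defaults(opts: StreamIngestOptions) -> StreamIngestOptions:
    """Return a copy in which every unset (None) field takes its default."""
    updates = {
        f.name: getattr(DEFAULTS, f.name)
        for f in fields(opts)
        if getattr(opts, f.name) is None
    }
    return replace(opts, **updates)


# Only streaming is requested; the other fields fall back to defaults.
resolved = fill_defaults(StreamIngestOptions(use_stream_api=True))
print(resolved)
```

Using `None` as "unset" lets an explicit `False` be told apart from a field the user never touched.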
2022-06-06 17:42:15 -07:00
Amog Kamsetty
365fc44754
[AIR] Update to new Predictor interface (#25425)
Updates the Predictor interface to have Pandas as a narrow waist.
2022-06-06 15:41:38 -07:00
Richard Liaw
36aee6a1c4
[air/docs] Update documentation structure (#25475)
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2022-06-06 15:15:11 -07:00
Balaji Veeramani
5e06baa77e
[AIR] Remove /Users/balaji from Torch example (#25515) 2022-06-06 13:13:54 -07:00
Eric Liang
1f509ab331
[air] Add DatasetParallelTrainer.dataset_config for configuring dataset ingest (#25337)
This adds a per-dataset config object to DataParallelTrainer. These configs define how the Dataset should be read into the DataParallelTrainer. It configures the preprocessing, splitting, and ingest strategy per-dataset. DataParallelTrainers declare default DatasetConfigs for each dataset passed in the ``datasets`` argument. Users have the opportunity to selectively override these configs by passing the ``dataset_config`` argument. Trainers can also define user customizable values (e.g., XGBoostTrainer doesn't support streaming ingest).

This PR adds the minimal support for dataset configs. Future PRs will:
- Add support for streaming ingest
- Move this config from DataParallelTrainer to ml.Trainer
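
A minimal sketch of the "trainer defaults plus selective user overrides" idea described above. `ToyDatasetConfig`, its `fit` and `split` fields, and `merge_configs` are hypothetical names chosen for illustration; the actual merge logic in this PR may differ.

```python
from dataclasses import dataclass, fields, replace
from typing import Dict, Optional


@dataclass
class ToyDatasetConfig:
    """Hypothetical per-dataset config; field names are illustrative only."""

    fit: Optional[bool] = None
    split: Optional[bool] = None


def merge_configs(
    defaults: Dict[str, ToyDatasetConfig],
    overrides: Dict[str, ToyDatasetConfig],
) -> Dict[str, ToyDatasetConfig]:
    """Apply each override field that is set (not None) on top of the defaults."""
    merged = dict(defaults)
    for name, override in overrides.items():
        base = merged.get(name, ToyDatasetConfig())
        changes = {
            f.name: getattr(override, f.name)
            for f in fields(override)
            if getattr(override, f.name) is not None
        }
        merged[name] = replace(base, **changes)
    return merged


# Defaults a trainer might declare, and one field overridden by the user.
trainer_defaults = {
    "train": ToyDatasetConfig(fit=True, split=True),
    "valid": ToyDatasetConfig(fit=False, split=False),
}
user_overrides = {"valid": ToyDatasetConfig(split=True)}

merged = merge_configs(trainer_defaults, user_overrides)
print(merged)
```

Only the fields the user sets are replaced, so the trainer's remaining defaults for that dataset stay in effect.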
2022-06-03 16:32:53 -07:00
Kai Fricke
4b9a89ad90
[air] Move python/ray/ml to python/ray/air (#25449)
The package "ml" should be renamed to "air".

Main question: Keep a `ml.py` with `from ray.air import *` for some level of backwards compatibility?
I'd go for no to force people to use the new structure.
2022-06-03 21:53:44 +01:00
matthewdeng
2e05b62236
[AIR] Preprocessors feature guide (#25302) 2022-06-03 11:43:51 -07:00
Eric Liang
51b295ad74
[docs] Improve Tune + Datasets documentation (#25389) 2022-06-01 21:52:32 -07:00
Balaji Veeramani
f9e7b55123
[AIR] Add Torch image example (#24618) 2022-05-27 16:47:21 -07:00
Amog Kamsetty
e8440cf52b
[AIR] Incremental Learning Example (#24420)
Example for domain incremental learning on Permuted MNIST Dataset with naive strategy
2022-05-26 12:28:28 -07:00
xwjiang2010
ff1fb9b5a2
[air example] train a Keras model on tabular data and serve it. (#24898) 2022-05-25 22:19:35 -07:00
Kai Fricke
d57ba750f5
[docs/air] Move upload example to docs (#25022) 2022-05-21 12:16:33 -07:00
Kai Fricke
e76efffec6
[air/docs] Move RL examples to docs (#24962)
Following #24959, this PR moves the RL examples (online/offline/serving) into the Ray ML docs. It also splits the online and offline parts.
2022-05-20 14:55:01 +01:00
Eric Liang
995309f9a3
[docs] Add AIR data ingest docs (part 1-- bulk loading only) (#24799) 2022-05-19 14:25:47 -07:00
Kai Fricke
9a8c8f4889
[air/docs] Move some examples from ml/examples to docs (#24959)
This moves the basic LightGBM, Sklearn, and XGBoost examples from the examples/ folder to the docs. We keep a symlink in the examples folder.
2022-05-19 14:01:49 +01:00
Antoni Baum
1d5e6d908d
[AIR] HuggingFace Text Classification example (#24402) 2022-05-18 09:35:12 -07:00