Commit graph

59 commits

Author SHA1 Message Date
Eric Liang
4c04c8d92c
[doc] Rename toc entry for libraries back to "Ray Libraries" (#26485) 2022-07-12 14:23:36 -07:00
Richard Liaw
92efc85b3b
[air/docs] checkpoints (#25901) 2022-07-11 20:40:23 -07:00
Richard Liaw
1abe908c22
[air/docs] improve consistency of getting started (#26247) 2022-07-11 20:16:37 -07:00
Antoni Baum
65ea710e30
[Docs] Update Train user guide to use the new APIs (#26091) 2022-07-11 15:10:10 -07:00
Richard Liaw
5892a76a44
[air/tune] Documentation testing fixes (#26409) 2022-07-09 19:47:21 -07:00
Amog Kamsetty
cc43bcccb4
[AIR] Update TensorflowPredictor to new API (#26215)
Updates TensorflowPredictor to use the new _predict_pandas API.

Also, as agreed offline, removes the extra configurations from TensorflowPredictor (column selection, concatenation) in favor of doing this via a Preprocessor.
2022-07-08 13:04:49 -07:00
Antoni Baum
ea94cda1f3
[AIR] Replace train. with session. (#26303)
This PR replaces legacy API calls to `train.` with AIR `session.` in Train code, examples and docs.

Depends on https://github.com/ray-project/ray/pull/25735
2022-07-07 16:29:04 -07:00
Antoni Baum
d1966899bb
[Docs] Small fix to AIR examples descriptions (#26227) 2022-07-05 17:16:56 -07:00
Simon Mo
88a219c7f2
Revert "Revert "[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment"" (#26231) 2022-07-05 13:26:49 -07:00
xwjiang2010
ac831fded4
[air] update documentation to use session.report (#26051)
Update documentation to use `session.report`.

Next steps:
1. Update our internal caller to use `session.report`. Most importantly, CheckpointManager and DataParallelTrainer.
2. Update `get_trial_resources` to use PGF notions to incorporate the requirement of ResourceChangingScheduler. @Yard1 
3. After 2 is done, change all `tune.get_trial_resources` to `session.get_trial_resources`
4. [internal implementation] remove special checkpoint handling logic from huggingface trainer. Optimize the flow for checkpoint conversion with `session.report`.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-30 10:37:31 -07:00
matthewdeng
4a21dc31ae
[air] update DummyTrainer to handle DatasetPipelines (#26175)
1. Update `DummyTrainer` to take `num_epochs` instead of `runtime_seconds`.
    1. Ray Train expects an equal number of calls to `train.report()` from every worker. With a time limit, workers that run at different speeds may terminate after different epoch numbers, which causes an error.
2. Add `generate_epochs` to support `DatasetPipeline` when `use_stream_api` is True.
3. Update `__main__` code to support testing different configurations.
2022-06-29 09:32:57 -07:00
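The reasoning behind item 1 of commit 4a21dc31ae can be sketched with a small simulation. This is a minimal illustration, not `DummyTrainer` itself: the two helper functions and the per-worker epoch durations are hypothetical, and no Ray API is called. It only shows why a time budget produces unequal report counts while an epoch budget does not.

```python
# Hypothetical helpers: count how many times a worker would report
# under each stopping rule. One report is assumed per finished epoch.

def reports_with_time_budget(seconds_per_epoch: float, runtime_seconds: float) -> int:
    """Worker loops until the time budget is used up."""
    return int(runtime_seconds // seconds_per_epoch)


def reports_with_epoch_budget(seconds_per_epoch: float, num_epochs: int) -> int:
    """Worker loops a fixed number of epochs, regardless of its speed."""
    return num_epochs


# Three workers with different (made-up) epoch durations in seconds.
worker_speeds = [1.0, 1.5, 2.0]

time_based = [reports_with_time_budget(s, runtime_seconds=6.0) for s in worker_speeds]
epoch_based = [reports_with_epoch_budget(s, num_epochs=3) for s in worker_speeds]

print(time_based)   # [6, 4, 3] -> workers disagree on the number of reports
print(epoch_based)  # [3, 3, 3] -> every worker reports the same number of times
```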
Antoni Baum
dc7ed086a5
[AIR] More checkpoint configurability, Result extension (#25943)
This PR:
* Allows the user to set `keep_checkpoints_num` and `checkpoint_score_attr` in `RunConfig` using the `CheckpointStrategy` dataclass
* Adds two new fields to the `Result` object - `best_checkpoints` - a list of saved best checkpoints as determined by `CheckpointingConfig`.
2022-06-29 08:23:29 -07:00
Antoni Baum
128f9e5664
[AIR] Move integration logging callbacks to AIR (#26126)
As the integration logging callbacks are commonly used with AIR Trainers, they should be moved from the tune package to the air package. The old imports will still work, but raise a deprecation warning.
2022-06-28 17:25:19 -07:00
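The "old imports still work, but raise a deprecation warning" behaviour described in commit 128f9e5664 is commonly implemented with a thin alias in the old location. The sketch below shows that general pattern only; `LoggerCallbackSketch`, `deprecated_alias`, and the package paths in the message are hypothetical names, not the code from the PR.

```python
import warnings


class LoggerCallbackSketch:
    """Stand-in for a callback class that now lives in the new package."""

    def __init__(self, project: str = "demo"):
        self.project = project


def deprecated_alias(new_cls, old_path: str, new_path: str):
    """Return a factory that warns, then builds the class from its new home."""

    def factory(*args, **kwargs):
        warnings.warn(
            f"{old_path} is deprecated; import {new_path} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this shim
        )
        return new_cls(*args, **kwargs)

    return factory


# What the old module would export (paths are illustrative only).
OldLoggerCallback = deprecated_alias(
    LoggerCallbackSketch,
    old_path="old_pkg.LoggerCallbackSketch",
    new_path="new_pkg.LoggerCallbackSketch",
)
```

Calling `OldLoggerCallback(project="x")` returns a normal `LoggerCallbackSketch` instance and emits one `DeprecationWarning`, so existing user code keeps running while the new import path is advertised.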
Stephanie Wang
c9be251b7a
Revert "[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment (#25962)" (#26176)
This reverts commit 68692b3464.
2022-06-28 17:07:07 -07:00
Simon Mo
68692b3464
[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment (#25962) 2022-06-28 10:26:10 -07:00
Antoni Baum
0ec198acc2
[AIR] Remove unnecessary pandas from examples (#26009)
Removes unnecessary pandas usage from AIR examples. Helps ensure users do not follow bad practices.
2022-06-24 14:38:23 -07:00
Antoni Baum
94492c2b49
[AIR/Docs] Improve user guide gallery (#26016)
Moves logging to examples, reorders to match TOC, adds missing entry.
2022-06-23 17:51:01 -07:00
Antoni Baum
91dd360f9d
[AIR/train] Move predictors to ray.train (#25769) 2022-06-15 17:02:15 -07:00
xwjiang2010
88d824d067
[air] remove fully_executed from Tune. (#25750) 2022-06-14 22:32:48 -07:00
Kai Fricke
fdf85ea403
[air] Add tutorial to convert existing pytorch code to Ray AIR (#25723) 2022-06-14 18:11:32 -07:00
Eric Liang
ff2cfbe351
[air] Add streaming BatchPredictor support (#25693) 2022-06-13 15:22:36 -07:00
Antoni Baum
5e9a8eb5f6
[AIR/data] Move preprocessors to ray.data (#25599)
Moves ray.air.Preprocessor and ray.air.preprocessors to ray.data to converge on the agreed upon package structure discussed internally.
2022-06-13 12:57:59 -07:00
matthewdeng
88524d8b57
[air] add CustomStatefulPreprocessor (#25497) 2022-06-09 16:54:46 -07:00
Eric Liang
a058a98c5d
[docs] Try to clarify some advantages of bulk ingest in the AIR ingest docs (#25616) 2022-06-09 11:47:22 -07:00
Amog Kamsetty
1316a2d05e
[AIR/Train] Move ray.air.train to ray.train (#25570) 2022-06-08 21:34:18 -07:00
Antoni Baum
7616435ed0
[Docs] Capitalize Ray AIR (#25597) 2022-06-08 14:37:53 -07:00
xwjiang2010
29a063afdf
[air] add feast example (#25417) 2022-06-07 14:55:42 -07:00
xwjiang2010
76b34d4a03
[air] add to_air_checkpoint method for inference only workload. (#25444)
Follow-up to our last discussion on supporting users who adopt AIR piecemeal.
Only implemented for TensorFlow for now; I want to collect feedback on API naming, package structure, etc. before adding others.
2022-06-07 14:50:39 -07:00
Simon Mo
7471b1fa41
[Serve] [AIR] ModelWrapper improvements and docs (#25003)
* batching collation code and tests
* wip notebook for np and dataframe
* finish content
* reset ray-more-libs changes
* add comments
* run through
* Apply suggestions from code review
* rename package
* lint
* richard's comment

Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
2022-06-07 08:53:10 -07:00
Eric Liang
c1afbcb6f4
[air] Enforce API stability annotations for AIR module (#25485) 2022-06-06 22:52:21 -07:00
Eric Liang
78688a0903
Enable streaming ingest in AIR (#25428)
This adds the following options to DatasetConfig, which can be used to enable streaming ingest.

```
    # Whether the dataset should be streamed into memory using pipelined reads.
    # When enabled, get_dataset_shard() returns DatasetPipeline instead of Dataset.
    # The amount of memory to use is controlled by `stream_window_size`.
    # False by default for all datasets.
    use_stream_api: Optional[bool] = None

    # Configure the streaming window size in bytes. A typical value is something like
    # 20% of object store memory. If set to -1, then an infinite window size will be
    # used (similar to bulk ingest). This only has an effect if use_stream_api is set.
    # Set to 1.0 GiB by default.
    stream_window_size: Optional[float] = None

    # Whether to enable global shuffle (per pipeline window in streaming mode). Note
    # that this is an expensive all-to-all operation, and most likely you want to use
    # local shuffle instead.
    # False by default for all datasets.
    global_shuffle: Optional[bool] = None
```
2022-06-06 17:42:15 -07:00
Amog Kamsetty
365fc44754
[AIR] Update to new Predictor interface (#25425)
Updates the Predictor interface to have Pandas as a narrow waist.
2022-06-06 15:41:38 -07:00
Richard Liaw
36aee6a1c4
[air/docs] Update documentation structure (#25475)
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2022-06-06 15:15:11 -07:00
Balaji Veeramani
5e06baa77e
[AIR] Remove /Users/balaji from Torch example (#25515) 2022-06-06 13:13:54 -07:00
Eric Liang
1f509ab331
[air] Add DatasetParallelTrainer.dataset_config for configuring dataset ingest (#25337)
This adds a per-dataset config object to DataParallelTrainer. These configs define how the Dataset should be read into the DataParallelTrainer. It configures the preprocessing, splitting, and ingest strategy per-dataset. DataParallelTrainers declare default DatasetConfigs for each dataset passed in the ``datasets`` argument. Users have the opportunity to selectively override these configs by passing the ``dataset_config`` argument. Trainers can also define user customizable values (e.g., XGBoostTrainer doesn't support streaming ingest).

This PR adds the minimal support for dataset configs. Future PRs will:
- Add support for streaming ingest
- Move this config from DataParallelTrainer to ml.Trainer
2022-06-03 16:32:53 -07:00
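The default-plus-override behaviour described in commit 1f509ab331 can be sketched as a per-dataset merge in which unset fields fall back to the trainer's defaults. This is an illustration under assumptions: `DatasetConfigSketch`, its `preprocess` and `split` fields, and `merge_dataset_configs` are hypothetical and are not taken from the PR.

```python
from dataclasses import dataclass, fields, replace
from typing import Dict, Optional


@dataclass
class DatasetConfigSketch:
    # None means "not set here; use the default". Field names are illustrative.
    preprocess: Optional[bool] = None
    split: Optional[bool] = None


def merge_dataset_configs(
    defaults: Dict[str, DatasetConfigSketch],
    overrides: Dict[str, DatasetConfigSketch],
) -> Dict[str, DatasetConfigSketch]:
    """Apply user overrides on top of trainer defaults, one dataset at a time."""
    merged = {}
    for name in defaults.keys() | overrides.keys():
        base = defaults.get(name, DatasetConfigSketch())
        user = overrides.get(name, DatasetConfigSketch())
        # Only fields the user explicitly set replace the default values.
        explicit = {
            f.name: getattr(user, f.name)
            for f in fields(user)
            if getattr(user, f.name) is not None
        }
        merged[name] = replace(base, **explicit)
    return merged


defaults = {
    "train": DatasetConfigSketch(preprocess=True, split=True),
    "valid": DatasetConfigSketch(preprocess=True, split=False),
}
overrides = {"train": DatasetConfigSketch(split=False)}
merged = merge_dataset_configs(defaults, overrides)
print(merged["train"])  # DatasetConfigSketch(preprocess=True, split=False)
```

Using `None` as the "unset" marker lets a user override a single field for a single dataset without restating the rest of the configuration.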
Kai Fricke
4b9a89ad90
[air] Move python/ray/ml to python/ray/air (#25449)
The package "ml" should be renamed to "air".

Main question: Keep a `ml.py` with `from ray.air import *` for some level of backwards compatibility?
I'd go for no to force people to use the new structure.
2022-06-03 21:53:44 +01:00
matthewdeng
2e05b62236
[AIR] Preprocessors feature guide (#25302) 2022-06-03 11:43:51 -07:00
Eric Liang
51b295ad74
[docs] Improve Tune + Datasets documentation (#25389) 2022-06-01 21:52:32 -07:00
Balaji Veeramani
f9e7b55123
[AIR] Add Torch image example (#24618) 2022-05-27 16:47:21 -07:00
Amog Kamsetty
e8440cf52b
[AIR] Incremental Learning Example (#24420)
Example for domain incremental learning on Permuted MNIST Dataset with naive strategy
2022-05-26 12:28:28 -07:00
xwjiang2010
ff1fb9b5a2
[air example] train a Keras model on tabular data and serve it. (#24898) 2022-05-25 22:19:35 -07:00
Kai Fricke
d57ba750f5
[docs/air] Move upload example to docs (#25022) 2022-05-21 12:16:33 -07:00
Kai Fricke
e76efffec6
[air/docs] Move RL examples to docs (#24962)
Following #24959, this PR moves the RL examples (online/offline/serving) into the Ray ML docs. It also splits the online and offline parts.
2022-05-20 14:55:01 +01:00
Eric Liang
995309f9a3
[docs] Add AIR data ingest docs (part 1-- bulk loading only) (#24799) 2022-05-19 14:25:47 -07:00
Kai Fricke
9a8c8f4889
[air/docs] Move some examples from ml/examples to docs (#24959)
This moves the basic LightGBM, Sklearn, and XGBoost examples from the examples/ folder to the docs. We keep a symlink in the examples folder.
2022-05-19 14:01:49 +01:00
Antoni Baum
1d5e6d908d
[AIR] HuggingFace Text Classification example (#24402) 2022-05-18 09:35:12 -07:00
Antoni Baum
c74886a55e
[CI] Run doc notebooks in CI (#24816)
Currently, we are not running doc notebooks in CI due to a bazel misconfiguration - we are using `glob` in a top level package in order to get the paths for the notebooks, but those are contained inside subpackages, which glob purposefully ignores. Therefore, the lists of notebooks to run are empty. This PR fixes that by:
* Running the `py_test_run_all_notebooks` macro inside the relevant subpackages
* Editing the `test_myst_doc.py` script to search recursively for the target file, which handles mismatches between the `name` and `data` arguments in `py_test_run_all_notebooks`
* Setting the `allow_empty=False` flag inside `glob` calls in our macros to ensure that this oversight is caught early
* Enabling detection of changes in doc folder for `*.ipynb` and `BUILD` files

This PR also adds a GPU runner for doc tests, allowing one of our examples to pass - and setting the infra for more to come. Finally, a misconfigured path for one set of doc tests is also fixed.
2022-05-17 09:50:42 +01:00
Antoni Baum
7158aeda33
[Datasets] Add Dataset.split_proportionately and ray.ml.train_test_split (#24476)
Adds a Dataset.split_proportionately method that allows the user to split a dataset using proportions. This is a very common use case, e.g., for train-test splitting. The implementation is a thin wrapper over Dataset.split_at_indices.

Additionally, this PR adds a ray.ml.train_test_split function intended to provide a familiar API to ML practitioners.
2022-05-16 20:47:29 -07:00
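The "thin wrapper over split_at_indices" idea from commit 7158aeda33 can be sketched on a plain Python list: cumulative proportions are turned into cut indices, and the index-based split does the rest. This is not the Ray implementation. Both functions below are stand-ins that operate on lists, and the rule that proportions must sum to less than 1 (with the remainder forming the last split) is an assumption of this sketch.

```python
from itertools import accumulate
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_at_indices(items: Sequence[T], indices: List[int]) -> List[List[T]]:
    """Cut `items` at each index; returns len(indices) + 1 pieces."""
    bounds = [0, *indices, len(items)]
    return [list(items[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def split_proportionately(items: Sequence[T], proportions: List[float]) -> List[List[T]]:
    """Split by proportions; whatever is left over becomes the final piece."""
    if not proportions or any(p <= 0 for p in proportions) or sum(proportions) >= 1:
        raise ValueError("proportions must be positive and sum to less than 1")
    # Running totals (e.g. 0.2, 0.5) become absolute cut positions.
    indices = [int(len(items) * c) for c in accumulate(proportions)]
    return split_at_indices(items, indices)


train, test = split_proportionately(list(range(10)), [0.8])
print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(test)   # [8, 9]
```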
Richard Liaw
41de6acd10
[air] fix-docs (#24792)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2022-05-13 15:58:31 -07:00
Kai Fricke
a92ce9721c
[air] Example to run tuning and analyze results (#24602)
This is a notebook showing how to tune an xgboost model and analyze the results.

Also adds a `get_dataframe()` method to `ResultsGrid` to fetch the trial results.

Depends on #24483 for toctree.
2022-05-13 15:22:36 +01:00