
.. _air-key-concepts:

Key Concepts
============

This page covers the main concepts in Ray AIR.

.. contents::
    :local:


Preprocessors
-------------

Preprocessors are primitives that can be used to transform input data into features.

A preprocessor is fitted during training and then applied in the same way to data batches in both training and serving. AIR comes with a collection of built-in preprocessors, and you can also define your own with simple templates.

Preprocessors operate on :ref:`Ray Datasets <datasets>`, which makes them scalable and compatible with a variety of datasources and dataframe libraries.


.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_preprocessors_start__
    :end-before: __air_preprocessors_end__
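
To make the fit-then-apply contract concrete, here is a minimal pure-Python sketch. It does not use Ray: ``ToyStandardScaler`` and its methods are hypothetical names chosen for illustration, not one of AIR's built-in preprocessors. The statistics are computed once on training data and then reused unchanged on every later batch, which is what keeps training and serving consistent.

.. code-block:: python

    from statistics import mean, pstdev


    class ToyStandardScaler:
        """Hypothetical stand-in for a preprocessor; not part of Ray AIR."""

        def fit(self, values):
            # Learn the statistics once, from training data only.
            self.mean_ = mean(values)
            self.std_ = pstdev(values) or 1.0  # avoid dividing by zero
            return self

        def transform_batch(self, batch):
            # Apply the stored statistics; nothing is re-estimated here.
            return [(v - self.mean_) / self.std_ for v in batch]


    scaler = ToyStandardScaler().fit([1.0, 2.0, 3.0])
    train_features = scaler.transform_batch([1.0, 2.0, 3.0])
    # The same fitted transform is reused on a serving-time batch.
    serving_features = scaler.transform_batch([2.0])

Because ``transform_batch`` only reads the fitted state, a value seen at serving time maps to the same feature it would have had during training.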


Trainers
--------

Trainers are wrapper classes around third-party training frameworks like XGBoost and PyTorch. They are built to help integrate with core Ray actors (for distribution), Ray Tune, and Ray Datasets.

See the documentation on :ref:`Trainers <air-trainer-ref>`.

.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_trainer_start__
    :end-before: __air_trainer_end__


Trainer objects produce a :ref:`Result <air-results-ref>` object after calling ``.fit()``. These objects contain training metrics as well as checkpoints to retrieve the best model.

.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_trainer_output_start__
    :end-before: __air_trainer_output_end__
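
The relationship between a Trainer and its result can also be sketched in plain Python. Everything below is hypothetical (``ToyTrainer``, ``ToyResult``, and the fake loss curve) and is not Ray AIR's implementation. It only mirrors the shape described above, where ``.fit()`` returns one object that bundles metrics with a checkpoint for the best model.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass
    class ToyResult:
        """Hypothetical result container; not Ray AIR's Result class."""

        metrics: dict
        checkpoint: dict


    class ToyTrainer:
        """Hypothetical wrapper around a made-up training loop."""

        def __init__(self, num_epochs):
            self.num_epochs = num_epochs

        def fit(self):
            best = None
            for epoch in range(1, self.num_epochs + 1):
                loss = 1.0 / epoch  # fake, steadily decreasing loss
                if best is None or loss < best.metrics["loss"]:
                    # Keep the state of the best epoch as the checkpoint.
                    best = ToyResult(
                        metrics={"loss": loss, "epoch": epoch},
                        checkpoint={"weights": [loss]},
                    )
            return best


    result = ToyTrainer(num_epochs=4).fit()

In this sketch, ``result.metrics`` holds the best reported loss and ``result.checkpoint`` holds the state needed to restore that model.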

.. _air-session-key-concepts:

Session
-------

Ray AIR exposes a functional API for users to define training behavior, or for developers to create their own ``Trainer``\s.
In both cases, there is a need for the following interactions:

1. To disseminate information downstream, including ``trial_name``, ``trial_id``, ``trial_resources``, and rank information.
2. To report information upstream, including metrics and checkpoints.

To facilitate such interactions, we introduce the :ref:`Session <air-session-ref>` concept.
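
The two directions of communication listed above can be illustrated with a small pure-Python sketch. ``ToySession``, its attributes, and ``train_fn`` are hypothetical names used only for this example; they are an assumption about the general shape of such an object, not Ray AIR's session API.

.. code-block:: python

    class ToySession:
        """Hypothetical session object; not Ray AIR's session API."""

        def __init__(self, trial_name, trial_id, world_rank):
            # Downstream: context handed to the training function.
            self.trial_name = trial_name
            self.trial_id = trial_id
            self.world_rank = world_rank
            # Upstream: everything the training function reports back.
            self.reports = []

        def report(self, metrics, checkpoint=None):
            self.reports.append({"metrics": metrics, "checkpoint": checkpoint})


    def train_fn(session):
        for step in range(3):
            loss = 1.0 / (step + 1)
            # In this sketch, only rank 0 attaches a checkpoint.
            checkpoint = {"step": step} if session.world_rank == 0 else None
            session.report({"loss": loss}, checkpoint=checkpoint)


    session = ToySession(trial_name="toy_trial", trial_id="0001", world_rank=0)
    train_fn(session)

The training function reads its context from the session and pushes metrics and checkpoints back through the same object, so it never needs to know who launched it.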

The session concept exists on two levels: the execution layer (called `Tune Session`) and the Data Parallel training layer
(called `Train Session`).
The following figure shows what these two sessions look like in a Data Parallel training scenario.

.. image:: images/session.svg
   :width: 650px
   :align: center

..
  https://docs.google.com/drawings/d/1g0pv8gqgG29aPEPTcd4BC0LaRNbW1sAkv3H6W1TCp0c/edit


Tuner
-----

:ref:`Tuners <air-tuner-ref>` offer scalable hyperparameter tuning as part of :ref:`Ray Tune <tune-main>`.

Tuners work seamlessly with any Trainer and also support arbitrary training functions.

.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_tuner_start__
    :end-before: __air_tuner_end__
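
Conceptually, tuning an arbitrary training function means running it once per candidate configuration and keeping the best result. The sketch below shows that idea as a sequential grid search in plain Python. It is not Ray Tune: ``train_fn``, ``param_space``, and the objective are made up for illustration, and a scalable tuner would run the trials in parallel rather than in a loop.

.. code-block:: python

    from itertools import product


    def train_fn(config):
        # Made-up objective that is lowest at lr=0.1, depth=3.
        return {"loss": (config["lr"] - 0.1) ** 2 + (config["depth"] - 3) ** 2}


    param_space = {"lr": [0.01, 0.1, 1.0], "depth": [2, 3, 4]}

    # Expand the search space into one config per trial.
    keys = list(param_space)
    configs = [dict(zip(keys, values)) for values in product(*param_space.values())]

    # Run every trial sequentially and keep the one with the lowest loss.
    trials = [(config, train_fn(config)) for config in configs]
    best_config, best_metrics = min(trials, key=lambda trial: trial[1]["loss"])

The training function only sees a ``config`` dictionary, which is why the same function can be reused unchanged across all trials.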


Batch Predictor
---------------

You can take a trained model and run batch inference with the ``BatchPredictor`` object.

.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_batch_predictor_start__
    :end-before: __air_batch_predictor_end__
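
As a rough mental model, batch inference splits a dataset into batches and applies the trained model to each one. The following pure-Python sketch shows that loop. ``iter_batches`` and ``toy_model`` are hypothetical helpers written for this example, and the "model" is a fixed linear function rather than anything trained.

.. code-block:: python

    def iter_batches(rows, batch_size):
        # Yield consecutive slices of at most batch_size rows.
        for start in range(0, len(rows), batch_size):
            yield rows[start:start + batch_size]


    def toy_model(batch):
        # Hypothetical "trained model": a fixed linear function.
        return [2.0 * x + 1.0 for x in batch]


    rows = [0.0, 1.0, 2.0, 3.0, 4.0]
    predictions = []
    for batch in iter_batches(rows, batch_size=2):
        predictions.extend(toy_model(batch))

Each batch is processed independently of the others, which is the property that lets this kind of workload be spread across many workers.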

.. _air-key-concepts-online-inference:

Online Inference
----------------

Deploy the model as an inference service by using Ray Serve and the ``PredictorDeployment`` class.

.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_deploy_start__
    :end-before: __air_deploy_end__

After deploying the service, you can send requests to it.

.. literalinclude:: doc_code/air_key_concepts.py
    :language: python
    :start-after: __air_inference_start__
    :end-before: __air_inference_end__
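
Independent of the serving framework, a request to an inference service follows a decode, predict, encode cycle. The sketch below shows that cycle as a plain function. ``handle_request``, ``toy_model``, and the ``features`` and ``prediction`` field names are assumptions made for illustration, not the payload format of ``PredictorDeployment``.

.. code-block:: python

    import json


    def toy_model(features):
        # Hypothetical model: the sum of the input features.
        return sum(features)


    def handle_request(body):
        # Decode the JSON payload, run the model, and encode the response.
        payload = json.loads(body)
        prediction = toy_model(payload["features"])
        return json.dumps({"prediction": prediction})


    response = handle_request(json.dumps({"features": [1, 2, 3]}))

A deployed service wraps this same cycle behind an HTTP endpoint, so clients only deal with the request and response payloads.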