After you train a model, you will often want to use the model to do inference and prediction.
To do so, you can use a Ray AIR Predictor. In this guide, we'll cover how to use the Predictor
on different types of data.
What are predictors?
--------------------
Ray AIR Predictors are a class that loads models from `Checkpoint` to perform inference.
Predictors are used by `BatchPredictor` and `PredictorDeployment` to do large-scale scoring or online inference.
Let's walk through a basic usage of the Predictor. In the below example, we create `Checkpoint` object from a model definition.
Checkpoints can be generated from a variety of different ways -- see the :ref:`Checkpoints <air-checkpoints-doc>` user guide for more details.
The checkpoint then is used to create a framework specific Predictor (in our example, a `TensorflowPredictor`), which then can be used for inference:
..literalinclude:: doc_code/predictors.py
:language:python
:start-after:__use_predictor_start__
:end-before:__use_predictor_end__
Predictors expose a ``predict`` method that accepts an input batch of type ``DataBatchType`` (which is a typing union of different standard Python ecosystem data types, such as Pandas Dataframe or Numpy Array) and outputs predictions of the same type as the input batch.
**Life of a prediction:** Underneath the hood, when the ``Predictor.predict`` method is called the following occurs:
- The input batch is converted into a Pandas DataFrame. Tensor input (like a ``np.ndarray``) will be converted into a single column Pandas Dataframe.
- If there is a :ref:`Preprocessor <air-preprocessor-ref>` saved in the provided :ref:`Checkpoint <air-checkpoint-ref>`, the preprocessor will be used to transform the DataFrame.
- The transformed DataFrame will be passed to the model for inference.
- The predictions will be outputted by ``predict`` in the same type as the original input.
Batch Prediction
----------------
Ray AIR provides a ``BatchPredictor`` utility for large-scale batch inference.
The BatchPredictor takes in a checkpoint and a predictor class and executes
large-scale batch prediction on a given dataset in a parallel/distributed fashion when calling ``predict()``.
..note::
``predict()`` will load the entire given dataset into memory, which may be a problem if your dataset
size is larger than your available cluster memory. See the :ref:`pipelined-prediction` section for a workaround.
..literalinclude:: doc_code/predictors.py
:language:python
:start-after:__batch_prediction_start__
:end-before:__batch_prediction_end__
Additionally, you can compute metrics from the predictions. Do this by:
1. specifying a function for computing metrics
2. using `keep_columns` to keep the label column in the returned dataset
3. using `map_batches` to compute metrics on a batch-by-batch basis
4. Aggregate batch metrics via `mean()`
..literalinclude:: doc_code/predictors.py
:language:python
:start-after:__compute_accuracy_start__
:end-before:__compute_accuracy_end__
Batch Inference Examples
------------------------
Below, we provide examples of using common frameworks to do batch inference for different data types: