mirror of
https://github.com/vale981/ray
synced 2025-03-05 10:01:43 -05:00
[Docs] [Train] Update Train API reference and docs (#28192)
Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>

Adds back more Ray Train APIs to the Ray Train docs. Also updates the user guide for better references.
parent
118b76218a
commit
b83f10dbde
4 changed files with 182 additions and 53 deletions
|
@@ -122,6 +122,8 @@ Training Result
|
||||||
.. automodule:: ray.air.result
|
.. automodule:: ray.air.result
|
||||||
:members:
|
:members:
|
||||||
|
|
||||||
|
.. _air-session-ref:
|
||||||
|
|
||||||
Training Session
|
Training Session
|
||||||
################
|
################
|
||||||
|
|
||||||
|
@@ -199,14 +201,17 @@ XGBoost
|
||||||
.. autoclass:: ray.train.xgboost.XGBoostTrainer
|
.. autoclass:: ray.train.xgboost.XGBoostTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.xgboost
|
.. automodule:: ray.train.xgboost
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: XGBoostTrainer
|
:exclude-members: XGBoostTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
LightGBM
|
LightGBM
|
||||||
########
|
########
|
||||||
|
@@ -214,14 +219,17 @@ LightGBM
|
||||||
.. autoclass:: ray.train.lightgbm.LightGBMTrainer
|
.. autoclass:: ray.train.lightgbm.LightGBMTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.lightgbm
|
.. automodule:: ray.train.lightgbm
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: LightGBMTrainer
|
:exclude-members: LightGBMTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
TensorFlow
|
TensorFlow
|
||||||
##########
|
##########
|
||||||
|
@@ -229,14 +237,17 @@ TensorFlow
|
||||||
.. autoclass:: ray.train.tensorflow.TensorflowTrainer
|
.. autoclass:: ray.train.tensorflow.TensorflowTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.tensorflow
|
.. automodule:: ray.train.tensorflow
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: TensorflowTrainer
|
:exclude-members: TensorflowTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. _air-pytorch-ref:
|
.. _air-pytorch-ref:
|
||||||
|
|
||||||
|
@@ -246,14 +257,17 @@ PyTorch
|
||||||
.. autoclass:: ray.train.torch.TorchTrainer
|
.. autoclass:: ray.train.torch.TorchTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.torch
|
.. automodule:: ray.train.torch
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: TorchTrainer
|
:exclude-members: TorchTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
Horovod
|
Horovod
|
||||||
#######
|
#######
|
||||||
|
@@ -261,14 +275,17 @@ Horovod
|
||||||
.. autoclass:: ray.train.horovod.HorovodTrainer
|
.. autoclass:: ray.train.horovod.HorovodTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.horovod
|
.. automodule:: ray.train.horovod
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: HorovodTrainer
|
:exclude-members: HorovodTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
HuggingFace
|
HuggingFace
|
||||||
###########
|
###########
|
||||||
|
@@ -276,14 +293,17 @@ HuggingFace
|
||||||
.. autoclass:: ray.train.huggingface.HuggingFaceTrainer
|
.. autoclass:: ray.train.huggingface.HuggingFaceTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.huggingface
|
.. automodule:: ray.train.huggingface
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: HuggingFaceTrainer
|
:exclude-members: HuggingFaceTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
Scikit-Learn
|
Scikit-Learn
|
||||||
############
|
############
|
||||||
|
@@ -291,14 +311,17 @@ Scikit-Learn
|
||||||
.. autoclass:: ray.train.sklearn.SklearnTrainer
|
.. autoclass:: ray.train.sklearn.SklearnTrainer
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. automethod:: __init__
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
.. automodule:: ray.train.sklearn
|
.. automodule:: ray.train.sklearn
|
||||||
:members:
|
:members:
|
||||||
:exclude-members: SklearnTrainer
|
:exclude-members: SklearnTrainer
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
|
|
||||||
Reinforcement Learning (RLlib)
|
Reinforcement Learning (RLlib)
|
||||||
|
@@ -307,6 +330,7 @@ Reinforcement Learning (RLlib)
|
||||||
.. automodule:: ray.train.rl
|
.. automodule:: ray.train.rl
|
||||||
:members:
|
:members:
|
||||||
:show-inheritance:
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
.. _air-builtin-callbacks:
|
.. _air-builtin-callbacks:
|
||||||
|
|
||||||
|
@@ -333,5 +357,3 @@ Weights and Biases
|
||||||
##################
|
##################
|
||||||
|
|
||||||
.. autoclass:: ray.air.callbacks.wandb.WandbLoggerCallback
|
.. autoclass:: ray.air.callbacks.wandb.WandbLoggerCallback
|
||||||
|
|
||||||
.. _air-session-ref:
|
|
||||||
|
|
|
@@ -2,54 +2,159 @@
|
||||||
|
|
||||||
Ray Train API
|
Ray Train API
|
||||||
=============
|
=============
|
||||||
|
This page covers framework specific integrations with Ray Train and Ray Train Developer APIs.
|
||||||
|
|
||||||
This page covers advanced configurations for specific frameworks using Train.
|
For core Ray AIR APIs, take a look at the :ref:`AIR Trainer package reference <air-trainer-ref>`.
|
||||||
|
|
||||||
For different high level trainers and their usage, take a look at the :ref:`AIR Trainer package reference <air-trainer-ref>`.
|
.. _train-integration-api:
|
||||||
|
|
||||||
.. _train-api-backend-config:
|
Trainer and Predictor Integrations
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
Backend Configurations
|
XGBoost
|
||||||
----------------------
|
~~~~~~~
|
||||||
|
|
||||||
.. _train-api-torch-config:
|
.. autoclass:: ray.train.xgboost.XGBoostTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
TorchConfig
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.xgboost
|
||||||
|
:members:
|
||||||
|
:exclude-members: XGBoostTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
LightGBM
|
||||||
|
~~~~~~~~
|
||||||
|
|
||||||
|
.. autoclass:: ray.train.lightgbm.LightGBMTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.lightgbm
|
||||||
|
:members:
|
||||||
|
:exclude-members: LightGBMTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
TensorFlow
|
||||||
|
~~~~~~~~~~
|
||||||
|
|
||||||
|
.. autoclass:: ray.train.tensorflow.TensorflowTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.tensorflow
|
||||||
|
:members:
|
||||||
|
:exclude-members: TensorflowTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
PyTorch
|
||||||
|
~~~~~~~
|
||||||
|
|
||||||
|
.. autoclass:: ray.train.torch.TorchTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.torch
|
||||||
|
:members:
|
||||||
|
:exclude-members: TorchTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
Horovod
|
||||||
|
~~~~~~~
|
||||||
|
|
||||||
|
.. autoclass:: ray.train.horovod.HorovodTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.horovod
|
||||||
|
:members:
|
||||||
|
:exclude-members: HorovodTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
HuggingFace
|
||||||
~~~~~~~~~~~
|
~~~~~~~~~~~
|
||||||
|
|
||||||
.. autoclass:: ray.train.torch.TorchConfig
|
.. autoclass:: ray.train.huggingface.HuggingFaceTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.huggingface
|
||||||
|
:members:
|
||||||
|
:exclude-members: HuggingFaceTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
Scikit-Learn
|
||||||
|
~~~~~~~~~~~~
|
||||||
|
|
||||||
|
.. autoclass:: ray.train.sklearn.SklearnTrainer
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
.. automethod:: __init__
|
||||||
|
|
||||||
|
|
||||||
|
.. automodule:: ray.train.sklearn
|
||||||
|
:members:
|
||||||
|
:exclude-members: SklearnTrainer
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
|
||||||
|
Reinforcement Learning (RLlib)
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
.. automodule:: ray.train.rl
|
||||||
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
|
||||||
|
|
||||||
|
Base Classes (Developer APIs)
|
||||||
|
-----------------------------
|
||||||
|
.. autoclass:: ray.train.trainer.BaseTrainer
|
||||||
|
:members:
|
||||||
:noindex:
|
:noindex:
|
||||||
|
|
||||||
.. _train-api-tensorflow-config:
|
.. automethod:: __init__
|
||||||
|
|
||||||
TensorflowConfig
|
|
||||||
~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
.. autoclass:: ray.train.tensorflow.TensorflowConfig
|
|
||||||
:noindex:
|
:noindex:
|
||||||
|
|
||||||
.. _train-api-horovod-config:
|
.. autoclass:: ray.train.data_parallel_trainer.DataParallelTrainer
|
||||||
|
:members:
|
||||||
HorovodConfig
|
:show-inheritance:
|
||||||
~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
.. autoclass:: ray.train.horovod.HorovodConfig
|
|
||||||
:noindex:
|
:noindex:
|
||||||
|
|
||||||
.. _train-api-backend-interfaces:
|
.. automethod:: __init__
|
||||||
|
:noindex:
|
||||||
|
|
||||||
Backend interfaces (for developers only)
|
.. autoclass:: ray.train.gbdt_trainer.GBDTTrainer
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
:members:
|
||||||
|
:show-inheritance:
|
||||||
|
:noindex:
|
||||||
|
|
||||||
Backend
|
.. automethod:: __init__
|
||||||
+++++++
|
:noindex:
|
||||||
|
|
||||||
.. autoclass:: ray.train.backend.Backend
|
.. autoclass:: ray.train.backend.Backend
|
||||||
|
:members:
|
||||||
BackendConfig
|
|
||||||
+++++++++++++
|
|
||||||
|
|
||||||
.. autoclass:: ray.train.backend.BackendConfig
|
.. autoclass:: ray.train.backend.BackendConfig
|
||||||
|
:members:
|
||||||
|
|
||||||
|
|
||||||
Deprecated APIs
|
Deprecated APIs
|
||||||
|
|
|
@@ -57,10 +57,11 @@ training.
|
||||||
to automatically prepare your model and data for distributed training.
|
to automatically prepare your model and data for distributed training.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
Ray Train will still work even if you don't use the ``prepare_model`` and ``prepare_data_loader`` utilities below,
|
Ray Train will still work even if you don't use the :func:`ray.train.torch.prepare_model`
|
||||||
|
and :func:`ray.train.torch.prepare_data_loader` utilities below,
|
||||||
and instead handle the logic directly inside your training function.
|
and instead handle the logic directly inside your training function.
|
||||||
|
|
||||||
First, use the ``prepare_model`` function to automatically move your model to the right device and wrap it in
|
First, use the :func:`~ray.train.torch.prepare_model` function to automatically move your model to the right device and wrap it in
|
||||||
``DistributedDataParallel``
|
``DistributedDataParallel``
|
||||||
|
|
||||||
.. code-block:: diff
|
.. code-block:: diff
|
||||||
|
@@ -89,7 +90,8 @@ training.
|
||||||
|
|
||||||
|
|
||||||
Then, use the ``prepare_data_loader`` function to automatically add a ``DistributedSampler`` to your ``DataLoader``
|
Then, use the ``prepare_data_loader`` function to automatically add a ``DistributedSampler`` to your ``DataLoader``
|
||||||
and move the batches to the right device.
|
and move the batches to the right device. This step is not necessary if you are passing in Ray Datasets to your Trainer
|
||||||
|
(see :ref:`train-datasets`).
|
||||||
|
|
||||||
.. code-block:: diff
|
.. code-block:: diff
|
||||||
|
|
||||||
|
@@ -216,7 +218,7 @@ with one of the following:
|
||||||
scaling_config=ScalingConfig(use_gpu=use_gpu, num_workers=2)
|
scaling_config=ScalingConfig(use_gpu=use_gpu, num_workers=2)
|
||||||
)
|
)
|
||||||
|
|
||||||
To customize the backend setup, you can use a :ref:`train-api-backend-config` object.
|
To customize the backend setup, you can use the :ref:`framework-specific config objects <train-integration-api>`.
|
||||||
|
|
||||||
.. tabbed:: PyTorch
|
.. tabbed:: PyTorch
|
||||||
|
|
||||||
|
@@ -258,7 +260,7 @@ To customize the backend setup, you can use a :ref:`train-api-backend-config` ob
|
||||||
scaling_config=ScalingConfig(num_workers=2),
|
scaling_config=ScalingConfig(num_workers=2),
|
||||||
)
|
)
|
||||||
|
|
||||||
For more configurability, please reference the :class:`BaseTrainer` API.
|
For more configurability, please reference the :py:class:`~ray.train.data_parallel_trainer.DataParallelTrainer` API.
|
||||||
|
|
||||||
Run training function
|
Run training function
|
||||||
~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
@@ -327,7 +329,7 @@ Accessing Training Results
|
||||||
|
|
||||||
.. TODO(ml-team) Flesh this section out.
|
.. TODO(ml-team) Flesh this section out.
|
||||||
|
|
||||||
The return of a ``Trainer.fit`` is a :class:`Result` object, containing
|
The return of a ``Trainer.fit`` is a :py:class:`~ray.air.result.Result` object, containing
|
||||||
information about the training run. You can access it to obtain saved checkpoints,
|
information about the training run. You can access it to obtain saved checkpoints,
|
||||||
metrics and other relevant data.
|
metrics and other relevant data.
|
||||||
|
|
||||||
|
@@ -370,7 +372,7 @@ For example, you can:
|
||||||
|
|
||||||
print(result.metrics_dataframe)
|
print(result.metrics_dataframe)
|
||||||
|
|
||||||
* Obtain the :class:`Checkpoint`, used for resuming training, prediction and serving.
|
* Obtain the :py:class:`~ray.air.checkpoint.Checkpoint`, used for resuming training, prediction and serving.
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
|
|
||||||
|
@@ -385,7 +387,7 @@ Log Directory Structure
|
||||||
Each ``Trainer`` will have a local directory created for logs and checkpoints.
|
Each ``Trainer`` will have a local directory created for logs and checkpoints.
|
||||||
|
|
||||||
You can obtain the path to the directory by accessing the ``log_dir`` attribute
|
You can obtain the path to the directory by accessing the ``log_dir`` attribute
|
||||||
of the :class:`Result` object returned by ``Trainer.fit``.
|
of the :py:class:`~ray.air.result.Result` object returned by ``Trainer.fit()``.
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
|
|
||||||
|
@@ -497,7 +499,7 @@ training function. This will cause the checkpoint state from the distributed
|
||||||
workers to be saved on the ``Trainer`` (where your python script is executed).
|
workers to be saved on the ``Trainer`` (where your python script is executed).
|
||||||
|
|
||||||
The latest saved checkpoint can be accessed through the ``checkpoint`` attribute of
|
The latest saved checkpoint can be accessed through the ``checkpoint`` attribute of
|
||||||
the :class:`Result`, and the best saved checkpoints can be accessed by the ``best_checkpoints``
|
the :py:class:`~ray.air.result.Result`, and the best saved checkpoints can be accessed by the ``best_checkpoints``
|
||||||
attribute.
|
attribute.
|
||||||
|
|
||||||
Concrete examples are provided to demonstrate how checkpoints (model weights but not models) are saved
|
Concrete examples are provided to demonstrate how checkpoints (model weights but not models) are saved
|
||||||
|
@@ -619,7 +621,7 @@ Configuring checkpoints
|
||||||
+++++++++++++++++++++++
|
+++++++++++++++++++++++
|
||||||
|
|
||||||
For more configurability of checkpointing behavior (specifically saving
|
For more configurability of checkpointing behavior (specifically saving
|
||||||
checkpoints to disk), a :class:`CheckpointConfig` can be passed into
|
checkpoints to disk), a :py:class:`~ray.air.config.CheckpointConfig` can be passed into
|
||||||
``Trainer``.
|
``Trainer``.
|
||||||
|
|
||||||
As an example, to completely disable writing checkpoints to disk:
|
As an example, to completely disable writing checkpoints to disk:
|
||||||
|
@@ -684,11 +686,11 @@ Loading checkpoints
|
||||||
|
|
||||||
Checkpoints can be loaded into the training function in 2 steps:
|
Checkpoints can be loaded into the training function in 2 steps:
|
||||||
|
|
||||||
1. From the training function, ``session.get_checkpoint`` can be used to access
|
1. From the training function, :func:`ray.air.session.get_checkpoint` can be used to access
|
||||||
the most recently saved :class:`Checkpoint`. This is useful to continue training even
|
the most recently saved :py:class:`~ray.air.checkpoint.Checkpoint`. This is useful to continue training even
|
||||||
if there's a worker failure.
|
if there's a worker failure.
|
||||||
2. The checkpoint to start training with can be bootstrapped by passing in a
|
2. The checkpoint to start training with can be bootstrapped by passing in a
|
||||||
:class:`Checkpoint` to ``Trainer`` as the ``resume_from_checkpoint`` argument.
|
:py:class:`~ray.air.checkpoint.Checkpoint` to ``Trainer`` as the ``resume_from_checkpoint`` argument.
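The two steps above could look roughly like the following sketch. It assumes the Ray 2.x AIR-era API that this page documents (``ray.air.session``, ``Checkpoint.from_dict`` / ``to_dict``, ``TorchTrainer``). The ``epoch`` checkpoint key, the ``num_epochs`` config key, and the ``resume_training`` helper are hypothetical names chosen for illustration, and the training step itself is left as a placeholder.

```python
def train_loop_per_worker(config):
    # Imported inside the function because it executes on the Ray Train workers.
    from ray.air import session
    from ray.air.checkpoint import Checkpoint

    start_epoch = 0

    # Step 1: fetch the most recently saved checkpoint, if there is one.
    # On a fresh run this is None; after a worker failure or an explicit
    # resume it holds the last reported state.
    checkpoint = session.get_checkpoint()
    if checkpoint is not None:
        state = checkpoint.to_dict()
        start_epoch = state["epoch"] + 1  # "epoch" is a hypothetical key

    for epoch in range(start_epoch, config["num_epochs"]):
        # ... one epoch of real training would go here ...

        # Report metrics together with a checkpoint so training can resume here.
        session.report(
            {"epoch": epoch},
            checkpoint=Checkpoint.from_dict({"epoch": epoch}),
        )


def resume_training(checkpoint, num_epochs=4):
    """Hypothetical helper: start a new run from an existing checkpoint."""
    from ray.air.config import ScalingConfig
    from ray.train.torch import TorchTrainer

    # Step 2: bootstrap the run by passing the checkpoint to the Trainer.
    trainer = TorchTrainer(
        train_loop_per_worker,
        train_loop_config={"num_epochs": num_epochs},
        scaling_config=ScalingConfig(num_workers=2),
        resume_from_checkpoint=checkpoint,
    )
    return trainer.fit()
```

For instance, ``resume_training(result.checkpoint)`` would continue from the latest checkpoint of an earlier ``Result``. Storing only the epoch counter keeps the sketch short; a real checkpoint would normally also carry model and optimizer state.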
|
||||||
|
|
||||||
.. tabbed:: PyTorch
|
.. tabbed:: PyTorch
|
||||||
|
|
||||||
|
@@ -835,7 +837,7 @@ Callbacks
|
||||||
|
|
||||||
You may want to plug in your training code with your favorite experiment management framework.
|
You may want to plug in your training code with your favorite experiment management framework.
|
||||||
Ray AIR provides an interface to fetch intermediate results and callbacks to process/log your intermediate results
|
Ray AIR provides an interface to fetch intermediate results and callbacks to process/log your intermediate results
|
||||||
(the values passed into ``session.report(...)``).
|
(the values passed into :func:`ray.air.session.report`).
|
||||||
|
|
||||||
Ray AIR contains :ref:`built-in callbacks <air-builtin-callbacks>` for popular tracking frameworks, or you can implement your own callback via the :ref:`Callback <tune-callbacks-docs>` interface.
|
Ray AIR contains :ref:`built-in callbacks <air-builtin-callbacks>` for popular tracking frameworks, or you can implement your own callback via the :ref:`Callback <tune-callbacks-docs>` interface.
|
||||||
|
|
||||||
|
@@ -860,7 +862,7 @@ Custom Callbacks
|
||||||
++++++++++++++++
|
++++++++++++++++
|
||||||
|
|
||||||
If the provided callbacks do not cover your desired integrations or use-cases,
|
If the provided callbacks do not cover your desired integrations or use-cases,
|
||||||
you may always implement a custom callback by subclassing ``Callback``. If
|
you may always implement a custom callback by subclassing :py:class:`~ray.tune.logger.LoggerCallback`. If
|
||||||
the callback is general enough, please feel welcome to :ref:`add it <getting-involved>`
|
the callback is general enough, please feel welcome to :ref:`add it <getting-involved>`
|
||||||
to the ``ray`` `repository <https://github.com/ray-project/ray>`_.
|
to the ``ray`` `repository <https://github.com/ray-project/ray>`_.
|
||||||
|
|
||||||
|
@@ -1034,7 +1036,7 @@ Hyperparameter tuning (Ray Tune)
|
||||||
|
|
||||||
Hyperparameter tuning with :ref:`Ray Tune <tune-main>` is natively supported
|
Hyperparameter tuning with :ref:`Ray Tune <tune-main>` is natively supported
|
||||||
with Ray Train. Specifically, you can take an existing ``Trainer`` and simply
|
with Ray Train. Specifically, you can take an existing ``Trainer`` and simply
|
||||||
pass it into a :class:`Tuner`.
|
pass it into a :py:class:`~ray.tune.tuner.Tuner`.
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
|
|
||||||
|
@@ -1076,9 +1078,9 @@ precision datatype for operations like linear layers and convolutions.
|
||||||
|
|
||||||
You can train your Torch model with AMP by:
|
You can train your Torch model with AMP by:
|
||||||
|
|
||||||
1. Adding ``train.torch.accelerate(amp=True)`` to the top of your training function.
|
1. Adding :func:`ray.train.torch.accelerate` with ``amp=True`` to the top of your training function.
|
||||||
2. Wrapping your optimizer with ``train.torch.prepare_optimizer``.
|
2. Wrapping your optimizer with :func:`ray.train.torch.prepare_optimizer`.
|
||||||
3. Replacing your backward call with ``train.torch.backward``.
|
3. Replacing your backward call with :func:`ray.train.torch.backward`.
|
||||||
|
|
||||||
.. code-block:: diff
|
.. code-block:: diff
|
||||||
|
|
||||||
|
@@ -1120,7 +1122,7 @@ Reproducibility
|
||||||
.. tabbed:: PyTorch
|
.. tabbed:: PyTorch
|
||||||
|
|
||||||
To limit sources of nondeterministic behavior, add
|
To limit sources of nondeterministic behavior, add
|
||||||
``train.torch.enable_reproducibility()`` to the top of your training
|
:func:`ray.train.torch.enable_reproducibility` to the top of your training
|
||||||
function.
|
function.
|
||||||
|
|
||||||
.. code-block:: diff
|
.. code-block:: diff
|
||||||
|
@@ -1133,7 +1135,7 @@ Reproducibility
|
||||||
|
|
||||||
...
|
...
|
||||||
|
|
||||||
.. warning:: ``train.torch.enable_reproducibility`` can't guarantee
|
.. warning:: :func:`ray.train.torch.enable_reproducibility` can't guarantee
|
||||||
completely reproducible results across executions. To learn more, read
|
completely reproducible results across executions. To learn more, read
|
||||||
the `PyTorch notes on randomness <https://pytorch.org/docs/stable/notes/randomness.html>`_.
|
the `PyTorch notes on randomness <https://pytorch.org/docs/stable/notes/randomness.html>`_.
|
||||||
|
|
||||||
|
|
|
@@ -143,8 +143,8 @@ class DataParallelTrainer(BaseTrainer):
|
||||||
- **Use Case 1:** You want to do data parallel training, but want to have
|
- **Use Case 1:** You want to do data parallel training, but want to have
|
||||||
a predefined ``training_loop_per_worker``.
|
a predefined ``training_loop_per_worker``.
|
||||||
|
|
||||||
- **Use Case 2:** You want to implement a custom :ref:`Training backend
|
- **Use Case 2:** You want to implement a custom
|
||||||
<train-api-backend-interfaces>` that automatically handles
|
:py:class:`~ray.train.backend.Backend` that automatically handles
|
||||||
additional setup or teardown logic on each actor, so that the users of this
|
additional setup or teardown logic on each actor, so that the users of this
|
||||||
new trainer do not have to implement this logic. For example, a
|
new trainer do not have to implement this logic. For example, a
|
||||||
``TensorflowTrainer`` can be built on top of ``DataParallelTrainer``
|
``TensorflowTrainer`` can be built on top of ``DataParallelTrainer``
|
||||||
|
|