mirror of
https://github.com/vale981/ray
synced 2025-03-06 02:21:39 -05:00
[tune] Clarify Intro Tune Documentation (#8201)
parent
a77e5a8cbf
commit
be5235d982
10 changed files with 184 additions and 141 deletions
|
@ -1,3 +1,5 @@
|
|||
.. _actor-guide:
|
||||
|
||||
Using Actors
|
||||
============
|
||||
|
||||
|
|
|
@ -47,6 +47,8 @@ You can run this `toy PBT example <https://github.com/ray-project/ray/blob/maste
|
|||
.. autoclass:: ray.tune.schedulers.PopulationBasedTraining
|
||||
:noindex:
|
||||
|
||||
.. _tune-scheduler-hyperband:
|
||||
|
||||
Asynchronous HyperBand
|
||||
----------------------
|
||||
|
||||
|
|
|
@ -89,6 +89,8 @@ An example of this can be found in `bayesopt_example.py <https://github.com/ray-
|
|||
:show-inheritance:
|
||||
:noindex:
|
||||
|
||||
.. _tune-hyperopt:
|
||||
|
||||
HyperOpt Search (Tree-structured Parzen Estimators)
|
||||
---------------------------------------------------
|
||||
|
||||
|
@ -112,6 +114,7 @@ An example of this can be found in `hyperopt_example.py <https://github.com/ray-
|
|||
:show-inheritance:
|
||||
:noindex:
|
||||
|
||||
|
||||
SigOpt Search
|
||||
-------------
|
||||
|
||||
|
@ -141,6 +144,8 @@ An example of this can be found in `sigopt_example.py <https://github.com/ray-pr
|
|||
:show-inheritance:
|
||||
:noindex:
|
||||
|
||||
.. _tune-nevergrad:
|
||||
|
||||
Nevergrad Search
|
||||
----------------
|
||||
|
||||
|
@ -217,6 +222,8 @@ An example of this can be found in `dragonfly_example.py <https://github.com/ray
|
|||
:show-inheritance:
|
||||
:noindex:
|
||||
|
||||
.. _tune-ax:
|
||||
|
||||
Ax Search
|
||||
---------
|
||||
|
||||
|
|
|
@ -10,7 +10,7 @@ Tune: Scalable Hyperparameter Tuning
|
|||
Tune is a Python library for experiment execution and hyperparameter tuning at any scale. Core features:
|
||||
|
||||
* Launch a multi-node :ref:`distributed hyperparameter sweep <tune-distributed>` in less than 10 lines of code.
|
||||
* Supports any machine learning framework, including PyTorch, XGBoost, MXNet, and Keras. See :ref:`examples here <tune-guides-overview>`.
|
||||
* Supports any machine learning framework, :ref:`including PyTorch, XGBoost, MXNet, and Keras<tune-guides-overview>`.
|
||||
* Natively `integrates with optimization libraries <tune-searchalg.html>`_ such as `HyperOpt <https://github.com/hyperopt/hyperopt>`_, `Bayesian Optimization <https://github.com/fmfn/BayesianOptimization>`_, and `Facebook Ax <http://ax.dev>`_.
|
||||
* Choose among `scalable algorithms <tune-schedulers.html>`_ such as `Population Based Training (PBT)`_, `Vizier's Median Stopping Rule`_, and `HyperBand/ASHA`_.
|
||||
* Visualize results with `TensorBoard <https://www.tensorflow.org/get_started/summaries_and_tensorboard>`__.
|
||||
|
@ -19,14 +19,7 @@ Tune is a Python library for experiment execution and hyperparameter tuning at a
|
|||
.. _`Vizier's Median Stopping Rule`: tune-schedulers.html#median-stopping-rule
|
||||
.. _`HyperBand/ASHA`: tune-schedulers.html#asynchronous-hyperband
|
||||
|
||||
.. important:: Join our `community slack <https://forms.gle/9TSdDYUgxYs8SA9e8>`_ to discuss Ray!
|
||||
|
||||
For more information, check out:
|
||||
|
||||
* :ref:`Tune in 60 Seconds <tune-60-seconds>`: A quick overview of Tune and its key concepts.
|
||||
* :ref:`Tune Guides and Examples <tune-guides-overview>`: Examples, Tutorials, and Guides for how to use Tune.
|
||||
* `Code <https://github.com/ray-project/ray/tree/master/python/ray/tune>`__: GitHub repository for Tune.
|
||||
|
||||
**Want to get started?** Head over to the :ref:`60 second Tune tutorial <tune-60-seconds>`.
|
||||
|
||||
Quick Start
|
||||
-----------
|
||||
|
@ -57,14 +50,16 @@ If using TF2 and TensorBoard, Tune will also automatically generate TensorBoard
|
|||
:scale: 20%
|
||||
:align: center
|
||||
|
||||
Take a look at the :ref:`Distributed Experiments <tune-distributed>` documentation for:
|
||||
|
||||
1. Setting up distributed experiments on your local cluster
|
||||
2. Using AWS and GCP
|
||||
3. Spot instance usage/pre-emptible instances, and more.
|
||||
.. tip:: Join the `Ray community slack <https://forms.gle/9TSdDYUgxYs8SA9e8>`_ to discuss Ray Tune (and other Ray libraries)!
|
||||
|
||||
Talks and Blogs
|
||||
---------------
|
||||
Guides/Materials
|
||||
----------------
|
||||
|
||||
Here are some reference materials for Tune:
|
||||
|
||||
* :ref:`Tune Tutorials, Guides, and Examples <tune-guides-overview>`
|
||||
* `Code <https://github.com/ray-project/ray/tree/master/python/ray/tune>`__: GitHub repository for Tune
|
||||
|
||||
Below are some blog posts and talks about Tune:
|
||||
|
||||
|
|
|
@ -16,9 +16,9 @@ Take a look at any of the below tutorials to get started with Tune.
|
|||
<div class="sphx-glr-bigcontainer">
|
||||
|
||||
.. customgalleryitem::
|
||||
:tooltip: A gentle 60 second tour of core Tune concepts.
|
||||
:tooltip: Tune concepts in 60 seconds.
|
||||
:figure: /images/tune-workflow.png
|
||||
:description: :doc:`A gentle 60 second tour of Tune <tune-60-seconds>`
|
||||
:description: :doc:`Tune concepts in 60 seconds <tune-60-seconds>`
|
||||
|
||||
.. customgalleryitem::
|
||||
:tooltip: A simple Tune walkthrough.
|
||||
|
@ -124,6 +124,7 @@ Tune Examples
|
|||
|
||||
If any example is broken, or if you'd like to add an example to this page, feel free to raise an issue on our GitHub repository.
|
||||
|
||||
.. _tune-general-examples:
|
||||
|
||||
General Examples
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
|
|
@ -9,156 +9,167 @@ Let's quickly walk through the key concepts you need to know to use Tune. In thi
|
|||
:local:
|
||||
:depth: 1
|
||||
|
||||
Tune takes a user-defined Python function or class and evaluates it on a set of hyperparameter configurations. Each hyperparameter configuration evaluation is called a *trial*, and Tune runs multiple trials in parallel, leveraging Search Algorithms and Trial Schedulers to optimize your hyperparameters.
|
||||
|
||||
.. image:: /images/tune-workflow.png
|
||||
|
||||
Trainables
|
||||
----------
|
||||
|
||||
To allow Tune to optimize your model, Tune will need to control your training process. This is done via the Trainable API. Each *trial* corresponds to one instance of a Trainable; Tune will create multiple instances of the Trainable.
|
||||
Tune will optimize your training process using the :ref:`Trainable API <trainable-docs>`. To start, let's try to maximize this objective function:
|
||||
|
||||
The Trainable API is where you specify how to set up your model and track intermediate training progress. There are two types of Trainables - a **function-based API** is for fast prototyping, and **class-based** API that unlocks many Tune features such as checkpointing, pausing.
|
||||
.. code-block:: python

    def objective(x, a, b):
        return a * (x ** 0.5) + b
|
||||
|
||||
Here's an example of specifying the objective function using :ref:`the function-based Trainable API <tune-function-api>`:
|
||||
|
||||
.. code-block:: python

    from ray import tune

    def trainable(config):
        # config (dict): A dict of hyperparameters.

        for x in range(20):
            score = objective(x, config["a"], config["b"])

            tune.track.log(score=score)  # This sends the score to Tune.
|
||||
|
||||
There are two Trainable APIs. One is the :ref:`function-based API <tune-function-api>` that we demonstrated above.
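To see what the function-based flow amounts to, here is a minimal plain-Python sketch that does not use Ray at all. The ``run_function_trainable`` helper and the ``log`` callback are hypothetical stand-ins for Tune and ``tune.track.log``; they only illustrate that the function is called once with a ``config`` dict and reports one score per step.

```python
def objective(x, a, b):
    # Same toy objective used in the text.
    return a * (x ** 0.5) + b


def run_function_trainable(trainable, config):
    """Hypothetical stand-in for Tune: call the function, collect what it logs."""
    reported = []

    def log(**metrics):
        # Plays the role of tune.track.log in this sketch only.
        reported.append(metrics)

    trainable(config, log)
    return reported


def trainable(config, log):
    # config (dict): A dict of hyperparameters.
    for x in range(20):
        score = objective(x, config["a"], config["b"])
        log(score=score)


results = run_function_trainable(trainable, {"a": 2, "b": 4})
print(len(results), results[0], results[-1])
```

In real Tune code the function takes only ``config``; the extra ``log`` argument exists here so the sketch runs without Ray installed.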
|
||||
|
||||
The other is a :ref:`class-based API <tune-class-api>` that enables :ref:`checkpointing and pausing <tune-trainable-save-restore>`. Here's an example of specifying the objective function using the :ref:`class-based API <tune-class-api>`:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from ray import tune
|
||||
|
||||
class Trainable(tune.Trainable):
|
||||
"""Tries to iteratively find the password."""
|
||||
|
||||
def _setup(self, config):
|
||||
self.iter = 0
|
||||
self.password = 1024
|
||||
# config (dict): A dict of hyperparameters
|
||||
self.x = 0
|
||||
self.a = config["a"]
|
||||
self.b = config["b"]
|
||||
|
||||
def _train(self):
|
||||
"""Execute one step of 'training'. This function will be called iteratively"""
|
||||
self.iter += 1
|
||||
return {
|
||||
"accuracy": abs(self.iter - self.password),
|
||||
"training_iteration": self.iter # Tune will automatically provide this.
|
||||
}
|
||||
|
||||
def _stop(self):
|
||||
# perform any cleanup necessary.
|
||||
pass
|
||||
|
||||
Function API example:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
def trainable(config):
|
||||
"""
|
||||
Args:
|
||||
config (dict): Parameters provided from the search algorithm
|
||||
or variant generation.
|
||||
"""
|
||||
|
||||
while True:
|
||||
# ...
|
||||
tune.track.log(**kwargs)
|
||||
def _train(self): # This is called iteratively.
|
||||
score = objective(self.x, self.a, self.b)
|
||||
self.x += 1
|
||||
return {"score": score}
|
||||
|
||||
.. tip:: Do not use ``tune.track.log`` within a ``Trainable`` class.
|
||||
|
||||
See the documentation: :ref:`trainable-docs`.
|
||||
See the documentation: :ref:`trainable-docs` and :ref:`examples <tune-general-examples>`.
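To make the class-based flow concrete, the following plain-Python sketch mimics how a driver might call ``_setup`` once and then ``_train`` repeatedly. ``ToyTrainable`` and ``run_trial`` are hypothetical names for illustration; the class deliberately does not subclass ``tune.Trainable``, so this is an approximation of the calling pattern rather than Tune's implementation.

```python
def objective(x, a, b):
    return a * (x ** 0.5) + b


class ToyTrainable:
    """Plain-Python stand-in; it does NOT subclass tune.Trainable."""

    def _setup(self, config):
        # config (dict): A dict of hyperparameters.
        self.x = 0
        self.a = config["a"]
        self.b = config["b"]

    def _train(self):
        # One logical iteration of training.
        score = objective(self.x, self.a, self.b)
        self.x += 1
        return {"score": score}


def run_trial(trainable_cls, config, max_iterations):
    """Hypothetical driver: set up once, then call _train repeatedly."""
    trial = trainable_cls()
    trial._setup(config)
    history = []
    for iteration in range(1, max_iterations + 1):
        result = trial._train()
        # The driver, not the class, attaches the iteration counter.
        result["training_iteration"] = iteration
        history.append(result)
    return history


history = run_trial(ToyTrainable, {"a": 2, "b": 4}, max_iterations=3)
print(history)
```

Because the driver regains control after every ``_train`` call, it has a natural point at which to pause or checkpoint a trial.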
|
||||
|
||||
tune.run
|
||||
--------
|
||||
|
||||
Use ``tune.run`` to execute hyperparameter tuning using the core Ray APIs. This function manages your distributed experiment and provides many features such as logging, checkpointing, and early stopping.
|
||||
Use ``tune.run`` to execute hyperparameter tuning using the core Ray APIs. This function manages your experiment and provides many features such as :ref:`logging <tune-logging>`, :ref:`checkpointing <tune-checkpoint>`, and :ref:`early stopping <tune-stopping>`.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
# Pass in a Trainable class or function to tune.run.
|
||||
tune.run(trainable)
|
||||
|
||||
# Run 10 trials (each trial is one instance of a Trainable). Tune runs in
|
||||
# parallel and automatically determines concurrency.
|
||||
tune.run(trainable, num_samples=10)
|
||||
|
||||
# Run 1 trial, stop when trial has reached 10 iterations OR a mean accuracy of 0.98.
|
||||
tune.run(my_trainable, stop={"training_iteration": 10, "mean_accuracy": 0.98})
|
||||
|
||||
# Run 1 trial, search over hyperparameters, stop after 10 iterations.
|
||||
hyperparameters = {"lr": tune.uniform(0, 1), "momentum": tune.uniform(0, 1)}
|
||||
tune.run(my_trainable, config=hyperparameters, stop={"training_iteration": 10})
|
||||
|
||||
This function will report status on the command line until all Trials stop:
|
||||
This function will report status on the command line until all trials stop (each trial is one instance of a :ref:`Trainable <trainable-docs>`):
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
== Status ==
|
||||
Memory usage on this node: 11.4/16.0 GiB
|
||||
Using FIFO scheduling algorithm.
|
||||
Resources requested: 4/12 CPUs, 0/0 GPUs, 0.0/3.17 GiB heap, 0.0/1.07 GiB objects
|
||||
Resources requested: 1/12 CPUs, 0/0 GPUs, 0.0/3.17 GiB heap, 0.0/1.07 GiB objects
|
||||
Result logdir: /Users/foo/ray_results/myexp
|
||||
Number of trials: 4 (4 RUNNING)
|
||||
Number of trials: 1 (1 RUNNING)
|
||||
+----------------------+----------+---------------------+-----------+--------+--------+----------------+-------+
|
||||
| Trial name | status | loc | param1 | param2 | acc | total time (s) | iter |
|
||||
| Trial name | status | loc | a | b | score | total time (s) | iter |
|
||||
|----------------------+----------+---------------------+-----------+--------+--------+----------------+-------|
|
||||
| MyTrainable_a826033a | RUNNING | 10.234.98.164:31115 | 0.303706 | 0.0761 | 0.1289 | 7.54952 | 15 |
|
||||
| MyTrainable_a8263fc6 | RUNNING | 10.234.98.164:31117 | 0.929276 | 0.158 | 0.4865 | 7.0501 | 14 |
|
||||
| MyTrainable_a8267914 | RUNNING | 10.234.98.164:31111 | 0.068426 | 0.0319 | 0.9585 | 7.0477 | 14 |
|
||||
| MyTrainable_a826b7bc | RUNNING | 10.234.98.164:31112 | 0.729127 | 0.0748 | 0.1797 | 7.05715 | 14 |
|
||||
+----------------------+----------+---------------------+-----------+--------+--------+----------------+-------+
|
||||
|
||||
See the documentation: :ref:`tune-run-ref`.
|
||||
|
||||
You can also easily run 10 trials. Tune automatically :ref:`determines how many trials will run in parallel <tune-parallelism>`.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
tune.run(trainable, num_samples=10)
|
||||
|
||||
Finally, you can randomly sample or grid search hyperparameters via Tune's :ref:`search space API <tune-default-search-space>`:
|
||||
|
||||
.. code-block:: python

    # The keys match the hyperparameters that ``trainable`` reads from ``config``.
    space = {"a": tune.uniform(0, 1), "b": tune.uniform(0, 20)}
    tune.run(trainable, config=space, num_samples=10)
|
||||
|
||||
See more documentation: :ref:`tune-run-ref`.
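Random sampling over a search space can be pictured with a few lines of plain Python. This is only a sketch under stated assumptions: ``sample_config`` and ``random_search`` are hypothetical helpers, the ranges for ``a`` and ``b`` are chosen for illustration, and each "trial" is scored at its final step instead of being run by Tune.

```python
import random


def objective(x, a, b):
    return a * (x ** 0.5) + b


def sample_config(rng):
    # Illustrative space: a ~ uniform(0, 1), b ~ uniform(0, 20).
    return {"a": rng.uniform(0, 1), "b": rng.uniform(0, 20)}


def random_search(num_samples, num_iterations=20, seed=0):
    rng = random.Random(seed)
    trials = []
    for _ in range(num_samples):
        config = sample_config(rng)
        # Score each trial at its last step (x = num_iterations - 1).
        final_score = objective(num_iterations - 1, config["a"], config["b"])
        trials.append({"config": config, "score": final_score})
    return trials


trials = random_search(num_samples=10)
best = max(trials, key=lambda t: t["score"])
print(best)
```

Each sampled configuration is independent of the others, which is why trials of this kind can run in parallel.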
|
||||
|
||||
|
||||
Search Algorithms
|
||||
-----------------
|
||||
|
||||
To optimize the hyperparameters of your training process, you will want to explore a “search space”.
|
||||
|
||||
Search Algorithms are Tune modules that help explore a provided search space. It will use previous results from evaluating different hyperparameters to suggest better hyperparameters. Tune has SearchAlgorithms that integrate with many popular **optimization** libraries, such as `Nevergrad <https://github.com/facebookresearch/nevergrad>`_ and `Hyperopt <https://github.com/hyperopt/hyperopt/>`_.
|
||||
To optimize the hyperparameters of your training process, you will want to use a :ref:`Search Algorithm <tune-search-alg>` which will help suggest better hyperparameters.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
# https://github.com/hyperopt/hyperopt/
|
||||
# pip install hyperopt
|
||||
# Be sure to first run `pip install hyperopt`
|
||||
|
||||
from hyperopt import hp
|
||||
from ray.tune.suggest.hyperopt import HyperOptSearch
|
||||
|
||||
# Create a HyperOpt search space
|
||||
space = {"momentum": hp.uniform("momentum", 0, 20), "lr": hp.uniform("lr", 0, 1)}
|
||||
# Pass the search space into Tune's HyperOpt wrapper and maximize accuracy
|
||||
hyperopt = HyperOptSearch(space, metric="accuracy", mode="max")
|
||||
space = {
|
||||
"a": hp.uniform("a", 0, 1),
|
||||
"b": hp.uniform("b", 0, 20)
|
||||
|
||||
# Execute 20 trials using HyperOpt, stop after 20 iterations
|
||||
max_iters = {"training_iteration": 20}
|
||||
tune.run(trainable, search_alg=hyperopt, num_samples=20, stop=max_iters)
|
||||
# Note: Arbitrary HyperOpt search spaces should be supported!
|
||||
# "foo": hp.lognormal("foo", 0, 1)
|
||||
}
|
||||
|
||||
# Specify the search space and maximize score
|
||||
hyperopt = HyperOptSearch(space, metric="score", mode="max")
|
||||
|
||||
# Execute 20 trials using HyperOpt and stop after 20 iterations
|
||||
tune.run(
|
||||
trainable,
|
||||
search_alg=hyperopt,
|
||||
num_samples=20,
|
||||
stop={"training_iteration": 20}
|
||||
)
|
||||
|
||||
Tune has SearchAlgorithms that integrate with many popular **optimization** libraries, such as :ref:`Nevergrad <tune-nevergrad>` and :ref:`Hyperopt <tune-hyperopt>`.
|
||||
|
||||
See the documentation: :ref:`searchalg-ref`.
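The idea of using previous results to suggest better hyperparameters can be sketched without any optimization library. The ``suggest`` rule below is a deliberately simple assumption (random warm-up, then perturb the best configuration seen so far); it is not the Tree-structured Parzen Estimator that HyperOpt uses, and ``BOUNDS``, ``suggest``, and ``search`` are hypothetical names.

```python
import random


def objective(x, a, b):
    return a * (x ** 0.5) + b


# Illustrative bounds for each hyperparameter.
BOUNDS = {"a": (0.0, 1.0), "b": (0.0, 20.0)}


def suggest(history, rng, warmup=5, step=0.1):
    """Toy suggestion rule (not TPE): random at first, then perturb the best."""
    if len(history) < warmup:
        return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
    best_config = max(history, key=lambda t: t["score"])["config"]
    config = {}
    for k, (lo, hi) in BOUNDS.items():
        jitter = rng.gauss(0, step * (hi - lo))
        # Clip the perturbed value back into the allowed range.
        config[k] = min(hi, max(lo, best_config[k] + jitter))
    return config


def search(num_trials=20, seed=0):
    rng = random.Random(seed)
    history = []
    for _ in range(num_trials):
        config = suggest(history, rng)
        score = objective(19, config["a"], config["b"])
        history.append({"config": config, "score": score})
    return history


history = search()
print(max(t["score"] for t in history))
```

The key difference from random sampling is that ``suggest`` receives the history, so later trials depend on earlier results.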
|
||||
|
||||
Trial Schedulers
|
||||
----------------
|
||||
|
||||
In addition, you can make your training process more efficient by stopping, pausing, or changing the hyperparameters of running trials.
|
||||
In addition, you can make your training process more efficient by using a :ref:`Trial Scheduler <tune-schedulers>`.
|
||||
|
||||
Trial Schedulers are Tune modules that adjust and change distributed training runs during execution. These modules can stop/pause/tweak the hyperparameters of running trials, making your hyperparameter tuning process much faster. Population-based training and HyperBand are examples of popular optimization algorithms implemented as Trial Schedulers.
|
||||
Trial Schedulers can stop/pause/tweak the hyperparameters of running trials, making your hyperparameter tuning process much faster.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from ray.tune.schedulers import HyperBandScheduler
|
||||
|
||||
# Create HyperBand scheduler and maximize accuracy
|
||||
hyperband = HyperBandScheduler(metric="accuracy", mode="max")
|
||||
# Create HyperBand scheduler and maximize score
|
||||
hyperband = HyperBandScheduler(metric="score", mode="max")
|
||||
|
||||
# Execute 20 trials using HyperBand using a search space
|
||||
configs = {"lr": tune.uniform(0, 1), "momentum": tune.uniform(0, 1)}
|
||||
tune.run(MyTrainableClass, num_samples=20, config=configs, scheduler=hyperband)
|
||||
configs = {"a": tune.uniform(0, 1), "b": tune.uniform(0, 1)}
|
||||
|
||||
Unlike **Search Algorithms**, Trial Schedulers do not select which hyperparameter configurations to evaluate. However, you can use them together.
|
||||
tune.run(
|
||||
MyTrainableClass,
|
||||
config=configs,
|
||||
num_samples=20,
|
||||
scheduler=hyperband
|
||||
)
|
||||
|
||||
:ref:`Population-based Training <tune-scheduler-pbt>` and :ref:`HyperBand <tune-scheduler-hyperband>` are examples of popular optimization algorithms implemented as Trial Schedulers.
|
||||
|
||||
Unlike **Search Algorithms**, :ref:`Trial Schedulers <tune-schedulers>` do not select which hyperparameter configurations to evaluate. However, you can use them together.
|
||||
|
||||
See the documentation: :ref:`schedulers-ref`.
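Early stopping of running trials can be illustrated with a toy successive-halving loop. This sketch is an assumption-laden simplification and not Tune's HyperBand implementation: ``successive_halving`` is a hypothetical function that advances all trials in lockstep and, at each checkpoint, stops the worse half.

```python
import random


def objective(x, a, b):
    return a * (x ** 0.5) + b


def successive_halving(configs, max_iterations=20, check_every=5):
    """Toy early stopping: at each checkpoint, keep the better half of trials."""
    running = {
        i: {"config": c, "score": None, "iterations": 0}
        for i, c in enumerate(configs)
    }
    stopped = {}
    for x in range(max_iterations):
        # Advance every running trial by one iteration.
        for trial in running.values():
            c = trial["config"]
            trial["score"] = objective(x, c["a"], c["b"])
            trial["iterations"] = x + 1
        at_checkpoint = (x + 1) % check_every == 0
        if at_checkpoint and len(running) > 1:
            ranked = sorted(running, key=lambda i: running[i]["score"], reverse=True)
            # Stop the lower-ranked half; the rest keep training.
            for i in ranked[(len(ranked) + 1) // 2:]:
                stopped[i] = running.pop(i)
    return running, stopped


rng = random.Random(0)
configs = [{"a": rng.uniform(0, 1), "b": rng.uniform(0, 1)} for _ in range(8)]
survivors, stopped = successive_halving(configs)
print(len(survivors), len(stopped))
```

Note that the scheduler never proposes new configurations here; it only decides which existing trials continue, which is why it combines naturally with a search algorithm.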
|
||||
|
||||
|
||||
Analysis
|
||||
--------
|
||||
|
||||
After running a hyperparameter tuning job, you will want to analyze your results to determine what specific parameters are important and which hyperparameter values are the best.
|
||||
|
||||
``tune.run`` returns an :ref:`Analysis <tune-analysis-docs>` object which has methods you can use for analyzing your results. This object can also retrieve all training runs as dataframes, allowing you to do ad-hoc data analysis over your results.
|
||||
``tune.run`` returns an :ref:`Analysis <tune-analysis-docs>` object which has methods you can use for analyzing your training.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
|
@ -167,13 +178,16 @@ After running a hyperparameter tuning job, you will want to analyze your results
|
|||
# Get the best hyperparameters
|
||||
best_hyperparameters = analysis.get_best_config()
|
||||
|
||||
# Get a dataframe for the max accuracy seen for each trial
|
||||
df = analysis.dataframe(metric="mean_accuracy", mode="max")
|
||||
This object can also retrieve all training runs as dataframes, allowing you to do ad-hoc data analysis over your results.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
# Get a dataframe for the max score seen for each trial
|
||||
df = analysis.dataframe(metric="score", mode="max")
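Conceptually, a "max score per trial" table has one row per trial containing its best reported metric alongside its configuration. The sketch below builds such rows from plain dicts; the trial names, configurations, and ``rows`` structure are hypothetical and only approximate what the dataframe would contain.

```python
def objective(x, a, b):
    return a * (x ** 0.5) + b


# Hypothetical per-trial configs and the scores each trial would report.
configs = {"trial_0": {"a": 0.2, "b": 1.0}, "trial_1": {"a": 0.9, "b": 0.5}}
histories = {
    name: [objective(x, c["a"], c["b"]) for x in range(20)]
    for name, c in configs.items()
}

# One row per trial with the maximum score seen and its hyperparameters.
rows = [
    {"trial": name, "score": max(scores), **configs[name]}
    for name, scores in histories.items()
]
best_row = max(rows, key=lambda r: r["score"])
print(best_row["trial"], best_row["score"])
```

Selecting the row with the highest ``score`` is the same idea as asking the analysis object for the best configuration under ``mode="max"``.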
|
||||
|
||||
What's Next?
|
||||
~~~~~~~~~~~~
|
||||
|
||||
|
||||
Now that you have a working understanding of Tune, check out:
|
||||
|
||||
* :ref:`Tune Guides and Examples <tune-guides-overview>`: Examples and templates for using Tune with your preferred machine learning library.
|
||||
|
|
|
@ -5,7 +5,7 @@ A Basic Tune Tutorial
|
|||
|
||||
.. image:: /images/tune-api.svg
|
||||
|
||||
This tutorial will walk you through setting up a Tune experiment. Specifically, we'll leverage ASHA and Bayesian Optimization (via HyperOpt) in the following steps:
|
||||
This tutorial will walk you through setting up a Tune experiment using PyTorch. Specifically, we'll leverage ASHA and Bayesian Optimization (via HyperOpt) in the following steps:
|
||||
|
||||
1. Integrating Tune into your workflow
|
||||
2. Specifying a TrialScheduler
|
||||
|
|
|
@ -9,6 +9,8 @@ This document provides an overview of the core concepts as well as some of the c
|
|||
|
||||
.. contents:: :local:
|
||||
|
||||
.. _tune-parallelism:
|
||||
|
||||
Parallelism / GPUs
|
||||
------------------
|
||||
|
||||
|
@ -60,6 +62,8 @@ To attach to a Ray cluster, simply run ``ray.init`` before ``tune.run``:
|
|||
ray.init(address=<ray_address>)
|
||||
tune.run(trainable, num_samples=100, resources_per_trial={"cpu": 2, "gpu": 1})
|
||||
|
||||
.. _tune-default-search-space:
|
||||
|
||||
Search Space (Grid/Random)
|
||||
--------------------------
|
||||
|
||||
|
@ -219,6 +223,8 @@ You often will want to compute a large object (e.g., training data, model weight
|
|||
|
||||
tune.run(f)
|
||||
|
||||
.. _tune-stopping:
|
||||
|
||||
Stopping Trials
|
||||
---------------
|
||||
|
||||
|
@ -271,6 +277,8 @@ Finally, you can implement the ``Stopper`` abstract class for stopping entire ex
|
|||
|
||||
Note that in the above example the currently running trials will not stop immediately but will do so once their current iterations are complete. See the :ref:`tune-stop-ref` documentation.
|
||||
|
||||
.. _tune-logging:
|
||||
|
||||
Logging/Tensorboard
|
||||
-------------------
|
||||
|
||||
|
|
|
@ -7,30 +7,47 @@ Training can be done with either a **Class API** (``tune.Trainable``) or **funct
|
|||
|
||||
You can use the **function-based API** for fast prototyping. On the other hand, the ``tune.Trainable`` interface supports checkpoint/restore functionality and provides more control for advanced algorithms.
|
||||
|
||||
For the sake of example, let's maximize this objective function:
|
||||
|
||||
.. code-block:: python

    def objective(x, a, b):
        return a * (x ** 0.5) + b
|
||||
|
||||
.. _tune-function-api:
|
||||
|
||||
Function-based API
|
||||
------------------
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
def trainable(config):
|
||||
"""
|
||||
Args:
|
||||
config (dict): Parameters provided from the search algorithm
|
||||
or variant generation.
|
||||
"""
|
||||
# config (dict): A dict of hyperparameters.
|
||||
|
||||
while True:
|
||||
# ...
|
||||
tune.track.log(**kwargs)
|
||||
for x in range(20):
|
||||
score = objective(x, config["a"], config["b"])
|
||||
|
||||
tune.track.log(score=score) # This sends the score to Tune.
|
||||
|
||||
analysis = tune.run(
|
||||
trainable,
|
||||
config={
|
||||
"a": 2,
|
||||
"b": 4
|
||||
})
|
||||
|
||||
print("best config: ", analysis.get_best_config(metric="score", mode="max"))
|
||||
|
||||
.. tip:: Do not use ``tune.track.log`` within a ``Trainable`` class.
|
||||
|
||||
Tune will run this function on a separate thread in a Ray actor process. Note that this API is not checkpointable, since the thread will never return control back to its caller.
|
||||
|
||||
.. note:: If you have a lambda function that you want to train, you will need to first register the function: ``tune.register_trainable("lambda_id", lambda x: ...)``. You can then use ``lambda_id`` in place of ``my_trainable``.
|
||||
.. note:: If you want to pass in a Python lambda, you will need to first register the function: ``tune.register_trainable("lambda_id", lambda x: ...)``. You can then use ``lambda_id`` in place of ``my_trainable``.
|
||||
|
||||
Trainable API
|
||||
-------------
|
||||
.. _tune-class-api:
|
||||
|
||||
Trainable Class API
|
||||
-------------------
|
||||
|
||||
.. caution:: Do not use ``tune.track.log`` within a ``Trainable`` class.
|
||||
|
||||
|
@ -40,44 +57,40 @@ The Trainable **class API** will require users to subclass ``ray.tune.Trainable`
|
|||
|
||||
from ray import tune
|
||||
|
||||
class Guesser(tune.Trainable):
|
||||
"""Randomly picks a number from [1, 10000) to find the password."""
|
||||
|
||||
class Trainable(tune.Trainable):
|
||||
def _setup(self, config):
|
||||
self.guess = config["guess"]
|
||||
self.iter = 0
|
||||
self.password = 1024
|
||||
|
||||
def _train(self):
|
||||
"""Execute one step of 'training'. This function will be called iteratively"""
|
||||
self.iter += 1
|
||||
self.guess += 1
|
||||
return {
|
||||
"accuracy": abs(self.guess - self.password),
|
||||
"training_iteration": self.iter # Tune will automatically provide this.
|
||||
}
|
||||
# config (dict): A dict of hyperparameters
|
||||
self.x = 0
|
||||
self.a = config["a"]
|
||||
self.b = config["b"]
|
||||
|
||||
def _train(self): # This is called iteratively.
|
||||
score = objective(self.x, self.a, self.b)
|
||||
self.x += 1
|
||||
return {"score": score}
|
||||
|
||||
analysis = tune.run(
|
||||
Guesser,
|
||||
stop={"training_iteration": 10},
|
||||
num_samples=10,
|
||||
Trainable,
|
||||
stop={"training_iteration": 20},
|
||||
config={
|
||||
"guess": tune.randint(1, 10000)
|
||||
"a": 2,
|
||||
"b": 4
|
||||
})
|
||||
|
||||
print('best config: ', analysis.get_best_config(metric="diff", mode="min"))
|
||||
print('best config: ', analysis.get_best_config(metric="score", mode="max"))
|
||||
|
||||
As a subclass of ``tune.Trainable``, Tune will create a ``Guesser`` object on a separate process (using the Ray Actor API).
|
||||
Because it subclasses ``tune.Trainable``, the ``Trainable`` object is created by Tune on a separate process (using the :ref:`Ray Actor API <actor-guide>`).
|
||||
|
||||
1. ``_setup`` is invoked once, when training starts.
|
||||
2. ``_train`` is invoked **multiple times**. Each time, the Guesser object executes one logical iteration of training in the tuning process, which may include one or more iterations of actual training.
|
||||
2. ``_train`` is invoked **multiple times**. Each time, the Trainable object executes one logical iteration of training in the tuning process, which may include one or more iterations of actual training.
|
||||
3. ``_stop`` is invoked when training is finished.
|
||||
|
||||
.. tip:: As a rule of thumb, the execution time of ``_train`` should be large enough to avoid overheads (i.e. more than a few seconds), but short enough to report progress periodically (i.e. at most a few minutes).
|
||||
|
||||
In this example, we only implemented the ``_setup`` and ``_train`` methods for simplicity. Next, we'll implement ``_save`` and ``_restore`` for checkpointing and fault tolerance.
|
||||
|
||||
.. _tune-trainable-save-restore:
|
||||
|
||||
Save and Restore
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
|
|
|
@ -219,18 +219,19 @@ def run(run_or_experiment,
|
|||
TuneError: Any trials failed and `raise_on_failed_trial` is True.
|
||||
|
||||
Examples:
|
||||
>>> tune.run(mytrainable, scheduler=PopulationBasedTraining())
|
||||
|
||||
>>> tune.run(mytrainable, num_samples=5, reuse_actors=True)
|
||||
.. code-block:: python
|
||||
|
||||
>>> tune.run(
|
||||
>>> "PG",
|
||||
>>> num_samples=5,
|
||||
>>> config={
|
||||
>>> "env": "CartPole-v0",
|
||||
>>> "lr": tune.sample_from(lambda _: np.random.rand())
|
||||
>>> }
|
||||
>>> )
|
||||
# Run 10 trials (each trial is one instance of a Trainable). Tune runs
|
||||
# in parallel and automatically determines concurrency.
|
||||
tune.run(trainable, num_samples=10)
|
||||
|
||||
# Run 1 trial, stop when trial has reached 10 iterations
|
||||
tune.run(my_trainable, stop={"training_iteration": 10})
|
||||
|
||||
# Run 1 trial, search over hyperparameters, stop after 10 iterations.
|
||||
space = {"lr": tune.uniform(0, 1), "momentum": tune.uniform(0, 1)}
|
||||
tune.run(my_trainable, config=space, stop={"training_iteration": 10})
|
||||
"""
|
||||
trial_executor = trial_executor or RayTrialExecutor(
|
||||
queue_trials=queue_trials,
|
||||
|
|