Population Based Training
=========================

Ray Tune includes a distributed implementation of `Population Based Training (PBT) <https://deepmind.com/blog/population-based-training-neural-networks>`__.


PBT Scheduler
-------------

Ray Tune's PBT scheduler can be plugged in on top of an existing grid or random search experiment. To enable it, set the ``scheduler`` parameter of ``run_experiments``, for example:

.. code-block:: python

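    import random

    from ray.tune import run_experiments
    from ray.tune.pbt import PopulationBasedTraining

    run_experiments(
        {...},
        scheduler=PopulationBasedTraining(
            time_attr='time_total_s',
            reward_attr='mean_accuracy',
            perturbation_interval=600.0,
            hyperparam_mutations={
                # A list gives the allowed categorical values.
                "lr": [1e-3, 5e-4, 1e-4, 5e-5, 1e-5],
                # A function samples a continuous value.
                "alpha": lambda: random.uniform(0.0, 1.0),
                ...
            }))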

When the PBT scheduler is enabled, each trial variant is treated as a member of the population. Periodically, top-performing trials are checkpointed (this requires your Trainable to support `checkpointing <tune.html#trial-checkpointing>`__). Low-performing trials clone the checkpoints of top performers and perturb the configurations in the hope of discovering an even better variation.
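
The exploit-and-explore step described above can be sketched in plain Python. This is a simplified illustration, not Ray Tune's implementation: the function names, the ``quantile`` parameter, and the dictionary layout of each population member are all hypothetical, and the real scheduler may perturb values differently.

.. code-block:: python

    import copy
    import random


    def explore(config, mutations, rng):
        """Return a perturbed copy of ``config`` (hypothetical helper)."""
        new_config = copy.deepcopy(config)
        for key, spec in mutations.items():
            if callable(spec):
                # A function samples a fresh value.
                new_config[key] = spec()
            else:
                # A list gives the allowed values; pick one at random.
                new_config[key] = rng.choice(spec)
        return new_config


    def exploit_and_explore(population, mutations, quantile=0.25, rng=random):
        """Let low performers clone top performers, then perturb their configs.

        ``population`` is a list of dicts with the keys ``config``, ``score``
        and ``checkpoint`` (an assumed layout, used only for illustration).
        """
        if len(population) < 2:
            return population
        ranked = sorted(population, key=lambda member: member["score"])
        # Cap the cutoff so the bottom and top groups never overlap.
        cutoff = min(max(1, int(len(ranked) * quantile)), len(ranked) // 2)
        bottom, top = ranked[:cutoff], ranked[-cutoff:]
        for member in bottom:
            donor = rng.choice(top)
            # Exploit: copy the donor's saved state.
            member["checkpoint"] = copy.deepcopy(donor["checkpoint"])
            # Explore: start from the donor's config and perturb it.
            member["config"] = explore(donor["config"], mutations, rng)
        return population

In this sketch only the bottom group is modified; top performers keep their checkpoints and configurations and continue training unchanged.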

You can run this `toy PBT example <https://github.com/ray-project/ray/blob/master/python/ray/tune/examples/pbt_example.py>`__ to get an idea of how PBT operates. When training in PBT mode, a single trial may see many different hyperparameters over its lifetime, and these are recorded in its ``result.json`` file. The following figure, generated by the example, shows PBT discovering new hyperparameters over the course of a single experiment:

.. image:: pbt.png
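
To inspect how a trial's hyperparameters changed, you could scan its ``result.json`` file. The sketch below assumes that the file holds one JSON object per line and that each object carries a ``config`` dictionary and a ``training_iteration`` field. Check these assumptions against the files your version of Ray Tune writes. The helper name ``config_history`` is hypothetical.

.. code-block:: python

    import json


    def config_history(result_path, keys):
        """List (iteration, values) pairs at each change of the given keys.

        Assumes one JSON object per line, each with a ``config`` dict.
        """
        history = []
        last = None
        with open(result_path) as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                result = json.loads(line)
                config = result.get("config", {})
                snapshot = {key: config.get(key) for key in keys}
                # Record only the points where a tracked value changed.
                if snapshot != last:
                    history.append((result.get("training_iteration"), snapshot))
                    last = snapshot
        return history

Calling ``config_history(path, ["lr"])`` would return one entry for the initial learning rate, followed by one entry for each later change.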

.. autoclass:: ray.tune.pbt.PopulationBasedTraining