More replacements of tune.run() in examples/docstrings for Tuner.fit()

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
This commit is contained in:
parent
1a62a8f855
commit
18ec3afdc6
9 changed files with 55 additions and 127 deletions
@@ -91,7 +91,7 @@ which implements the proximal policy optimization algorithm in RLlib.
 # Train via Ray Tune.
 # Note that Ray Tune does not yet support AlgorithmConfig objects, hence
 # we need to convert back to old-style config dicts.
-tune.run(PPO, param_space=config.to_dict())
+tune.Tuner(PPO, param_space=config.to_dict()).fit()


 .. tabbed:: RLlib Command Line
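For orientation, here is a minimal, self-contained sketch of the migrated pattern in this hunk; the ``PPOConfig`` construction below is an assumption for illustration and is not part of the diff, only the last line is taken from it:

    # Hedged sketch of the Tuner-based call shown in the hunk above.
    # The PPOConfig construction is illustrative; only the last line comes
    # from the diff itself.
    import ray
    from ray import tune
    from ray.rllib.algorithms.ppo import PPO, PPOConfig

    ray.init()

    # Build an AlgorithmConfig and convert it to an old-style config dict,
    # since Ray Tune does not yet accept AlgorithmConfig objects directly.
    config = PPOConfig().environment(env="CartPole-v1")

    tune.Tuner(PPO, param_space=config.to_dict()).fit()
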
@@ -159,7 +159,7 @@ We can create an `Algorithm <#algorithms>`__ and try running this policy on a to
 return MyTFPolicy

 ray.init()
-tune.run(MyAlgo, config={"env": "CartPole-v0", "num_workers": 2})
+tune.Tuner(MyAlgo, param_space={"env": "CartPole-v0", "num_workers": 2}).fit()


 If you run the above snippet `(runnable file here) <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_tf_policy.py>`__, you'll probably notice that CartPole doesn't learn so well:
@@ -140,7 +140,7 @@ Serving and Offline
 - `Saving experiences <https://github.com/ray-project/ray/blob/master/rllib/examples/saving_experiences.py>`__:
 Example of how to externally generate experience batches in RLlib-compatible format.
 - `Finding a checkpoint using custom criteria <https://github.com/ray-project/ray/blob/master/rllib/examples/checkpoint_by_custom_criteria.py>`__:
-Example of how to find a checkpoint after a `tune.run` via some custom defined criteria.
+Example of how to find a checkpoint after a `Tuner.fit()` via some custom defined criteria.


 Multi-Agent and Hierarchical
@@ -755,19 +755,19 @@ All RLlib algorithms are compatible with the :ref:`Tune API <tune-60-seconds>`.
 .. code-block:: python

 import ray
-from ray import tune
+from ray import air, tune

 ray.init()
-tune.run(
+tune.Tuner(
 "PPO",
-stop={"episode_reward_mean": 200},
-config={
+run_config=air.RunConfig(stop={"episode_reward_mean": 200},),
+param_space={
 "env": "CartPole-v0",
 "num_gpus": 0,
 "num_workers": 1,
 "lr": tune.grid_search([0.01, 0.001, 0.0001]),
 },
-)
+).fit()

 Tune will schedule the trials to run in parallel on your Ray cluster:
@@ -783,19 +783,21 @@ Tune will schedule the trials to run in parallel on your Ray cluster:
 - PPO_CartPole-v0_0_lr=0.01: RUNNING [pid=21940], 16 s, 4013 ts, 22 rew
 - PPO_CartPole-v0_1_lr=0.001: RUNNING [pid=21942], 27 s, 8111 ts, 54.7 rew

-``tune.run()`` returns an ExperimentAnalysis object that allows further analysis of the training results and retrieving the checkpoint(s) of the trained agent.
+``Tuner.fit()`` returns a ``ResultGrid`` object that allows further analysis of the training results and retrieving the checkpoint(s) of the trained agent.
 It also simplifies saving the trained agent. For example:

 .. code-block:: python

-# tune.run() allows setting a custom log directory (other than ``~/ray-results``)
+# ``Tuner.fit()`` allows setting a custom log directory (other than ``~/ray-results``)
 # and automatically saving the trained agent
-analysis = ray.tune.run(
+results = ray.tune.Tuner(
 ppo.PPO,
-config=config,
-local_dir=log_dir,
-stop=stop_criteria,
-checkpoint_at_end=True)
+param_space=config,
+run_config=air.RunConfig(
+local_dir=log_dir,
+stop=stop_criteria,
+checkpoint_config=air.CheckpointConfig(checkpoint_at_end=True),
+)).fit()

 # list of lists: one list per checkpoint; each checkpoint list contains
 # 1st the path, 2nd the metric value
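For reference, a brief sketch of how the ``ResultGrid`` returned above might be queried afterwards; the metric name is taken from the grid-search example earlier in this commit and is only illustrative here:

    # Hedged sketch: inspect the ResultGrid returned by Tuner.fit().
    # The metric name "episode_reward_mean" is an illustrative assumption.
    best_result = results.get_best_result(metric="episode_reward_mean", mode="max")
    print(best_result.config)      # hyperparameters of the best trial
    print(best_result.checkpoint)  # checkpoint of the best trial, if any were saved
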
@@ -1352,7 +1354,7 @@ which receives the last training results and returns a new task for the env to b
 "env": MyEnv,
 "env_task_fn": curriculum_fn,
 }
-# Train using `tune.run` or `Algorithm.train()` and the above config stub.
+# Train using `Tuner.fit()` or `Algorithm.train()` and the above config stub.
 # ...

 There are two more ways to use the RLlib's other APIs to implement `curriculum learning <https://bair.berkeley.edu/blog/2017/12/20/reverse-curriculum/>`__.
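The ``env_task_fn`` above maps the last training results to a new task for the environment; a rough sketch of its expected shape follows (the reward threshold and task arithmetic are illustrative assumptions):

    # Hedged sketch of an env_task_fn compatible with the config stub above.
    # The reward threshold and task increment are illustrative assumptions.
    def curriculum_fn(train_results, task_settable_env, env_ctx):
        """Pick the next task based on the latest training results."""
        current_task = task_settable_env.get_task()
        # Move to a harder task once the mean episode reward is high enough.
        if train_results["episode_reward_mean"] > 200.0:
            return current_task + 1
        return current_task
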
@@ -1386,16 +1388,15 @@ customizations to your training loop.
 num_workers = 2

 ray.init()
-tune.run(
-train,
-config={
+tune.Tuner(
+tune.with_resources(train, resources=tune.PlacementGroupFactory(
+[{"CPU": 1}, {"GPU": num_gpus}] + [{"CPU": 1}] * num_workers
+),)
+param_space={
 "num_gpus": num_gpus,
 "num_workers": num_workers,
 },
-resources_per_trial=tune.PlacementGroupFactory(
-[{"CPU": 1}, {"GPU": num_gpus}] + [{"CPU": 1}] * num_workers
-),
-)
+).fit()

 You could also use RLlib's callbacks API to update the environment on new training results:
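Reassembled with indentation for readability, the migrated call in this hunk looks roughly like the sketch below (``train``, ``num_gpus`` and ``num_workers`` are assumed to be defined as in the surrounding example):

    # Hedged sketch of the Tuner + with_resources call from the hunk above.
    # `train`, `num_gpus` and `num_workers` are assumed to be defined already.
    tune.Tuner(
        tune.with_resources(
            train,
            resources=tune.PlacementGroupFactory(
                [{"CPU": 1}, {"GPU": num_gpus}] + [{"CPU": 1}] * num_workers
            ),
        ),
        param_space={
            "num_gpus": num_gpus,
            "num_workers": num_workers,
        },
    ).fit()
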
@@ -1418,13 +1419,13 @@ You could also use RLlib's callbacks API to update the environment on new traini
 lambda env: env.set_task(task)))

 ray.init()
-tune.run(
+tune.Tuner(
 "PPO",
-config={
+param_space={
 "env": YourEnv,
 "callbacks": MyCallbacks,
 },
-)
+).fit()

 Debugging
 ---------
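The ``MyCallbacks`` class used above is defined earlier in that docs example; a hedged sketch of such a callback follows (the task-selection rule is an illustrative assumption):

    # Hedged sketch of an RLlib callback that pushes a new task to all envs
    # on every new training result. The task-selection rule is illustrative.
    from ray.rllib.algorithms.callbacks import DefaultCallbacks


    class MyCallbacks(DefaultCallbacks):
        def on_train_result(self, algorithm, result, **kwargs):
            # Derive the next task from the latest training results.
            task = 2 if result["episode_reward_mean"] > 200 else 1
            # Update every environment on every rollout worker.
            algorithm.workers.foreach_worker(
                lambda worker: worker.foreach_env(
                    lambda env: env.set_task(task)))
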
@@ -102,18 +102,6 @@ You can save and load checkpoint in Ray Tune in the following manner:
 .. note:: ``checkpoint_freq`` and ``checkpoint_at_end`` will not work with Function API checkpointing.

 In this example, checkpoints will be saved by training iteration to ``local_dir/exp_name/trial_name/checkpoint_<step>``.
-You can restore a single trial checkpoint by using ``tune.run(restore=<checkpoint_dir>)``:
-
-.. code-block:: python
-
-analysis = tune.run(
-train,
-config={
-"max_iter": 5
-},
-).trials
-last_ckpt = trial.checkpoint.dir_or_data
-analysis = tune.run(train, config={"max_iter": 10}, restore=last_ckpt)

 Tune also may copy or move checkpoints during the course of tuning. For this purpose,
 it is important not to depend on absolute paths in the implementation of ``save``.
@@ -256,23 +256,10 @@ In this example, checkpoints will be saved:
 * **On head node**: ``~/ray-results/my-tune-exp/<trial_name>/checkpoint_<step>`` (but only for trials done on that node)
 * **On workers nodes**: ``~/ray-results/my-tune-exp/<trial_name>/checkpoint_<step>`` (but only for trials done on that node)

-If your run stopped for any reason (finished, errored, user CTRL+C), you can restart it any time by running the script above again -- note with ``resume="AUTO"``, it will detect the previous run so long as the ``sync_config`` points to the same location.
-
-If, however, you prefer not to use ``resume="AUTO"`` (or are on an older version of Ray) you can resume manaully:
-
-.. code-block:: python
-
-# Restored previous trial from the given checkpoint
-tune.run(
-# our same trainable as before
-my_trainable,
-
-# The name can be different from your original name
-name="my-tune-exp-restart",
-
-# our same config as above!
-restore=sync_config,
-)
+If your run stopped for any reason (finished, errored, user CTRL+C), you can restart it any time by
+``tuner=Tuner.restore(experiment_checkpoint_dir).fit()``.
+There are a few options for restoring an experiment:
+"resume_unfinished", "resume_errored" and "restart_errored". See ``Tuner.restore()`` for more details.

 .. _rsync-checkpointing:
@@ -258,30 +258,9 @@ If the trial/actor is placed on a different node, Tune will automatically push t
 Recovering From Failures
 ~~~~~~~~~~~~~~~~~~~~~~~~

-Tune automatically persists the progress of your entire experiment (a ``Tuner.fit()`` session), so if an experiment crashes or is otherwise cancelled, it can be resumed by passing one of True, False, "LOCAL", "REMOTE", or "PROMPT" to ``tune.run(resume=...)``. Note that this only works if trial checkpoints are detected, whether it be by manual or periodic checkpointing.
-
-**Settings:**
-
-- The default setting of ``resume=False`` creates a new experiment.
-- ``resume="LOCAL"`` and ``resume=True`` restore the experiment from ``local_dir/[experiment_name]``.
-- ``resume="REMOTE"`` syncs the upload dir down to the local dir and then restores the experiment from ``local_dir/experiment_name``.
-- ``resume="ERRORED_ONLY"`` will look for errored trials in ``local_dir/[experiment_name]`` and only run these (and start from scratch).
-- ``resume="PROMPT"`` will cause Tune to prompt you for whether you want to resume. You can always force a new experiment to be created by changing the experiment name.
-- ``resume="AUTO"`` will automatically look for an existing experiment at ``local_dir/[experiment_name]``. If found, it will be continued (as if ``resume=True``), otherwise a new experiment is started.
-
-Note that trials will be restored to their last checkpoint. If trial checkpointing is not enabled, unfinished trials will be restarted from scratch.
-
-E.g.:
-
-.. code-block:: python
-
-tune.run(
-my_trainable, # Function trainable that saves checkpoints
-local_dir="~/path/to/results",
-resume=True
-)
-
-Upon a second run, this will restore the entire experiment state from ``~/path/to/results/my_experiment_name``. Importantly, any changes to the experiment specification upon resume will be ignored. For example, if the previous experiment has reached its termination, then resuming it with a new stop criterion will not run. The new experiment will terminate immediately after initialization. If you want to change the configuration, such as training more iterations, you can do so restore the checkpoint by setting ``restore=<path-to-checkpoint>`` - note that this only works for a single trial.
+Tune automatically persists the progress of your entire experiment (a ``Tuner.fit()`` session), so if an experiment crashes or is otherwise cancelled, it can be resumed through ``Tuner.restore()``.
+There are a few options for restoring an experiment:
+"resume_unfinished", "resume_errored" and "restart_errored". See ``Tuner.restore()`` for more details.

 .. _tune-distributed-common:
@@ -33,27 +33,6 @@ You can specify the ``local_dir`` and ``trainable_name``:
 run_config=air.RunConfig(local_dir="./results", name="test_experiment"))
 results = tuner.fit()

-To specify custom trial folder names, you can pass use the ``trial_name_creator`` argument to `tune.run`.
-This takes a function with the following signature:
-
-.. code-block:: python
-
-def trial_name_string(trial):
-"""
-Args:
-trial (Trial): A generated trial object.
-
-Returns:
-trial_name (str): String representation of Trial.
-"""
-return str(trial)
-
-tune.run(
-MyTrainableClass,
-name="example-experiment",
-num_samples=1,
-trial_name_creator=trial_name_string
-)

 To learn more about Trials, see its detailed API documentation: :ref:`trial-docstring`.
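The removed snippet passed ``trial_name_creator`` to ``tune.run``; in the Tuner API the same hook lives on ``tune.TuneConfig``. A hedged sketch, assuming a Ray 2.x version where ``TuneConfig`` exposes ``trial_name_creator`` (``MyTrainableClass`` is assumed to be defined as in the removed example):

    # Hedged sketch: custom trial names with the Tuner API.
    # Assumes TuneConfig exposes trial_name_creator; MyTrainableClass is
    # assumed to be defined elsewhere.
    from ray import tune


    def trial_name_string(trial):
        """Return the string used as this trial's name."""
        return str(trial)


    tune.Tuner(
        MyTrainableClass,
        tune_config=tune.TuneConfig(
            num_samples=1,
            trial_name_creator=trial_name_string,
        ),
    ).fit()
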
@@ -62,7 +41,7 @@ To learn more about Trials, see its detailed API documentation: :ref:`trial-docs
 How to log to TensorBoard?
 --------------------------

-Tune automatically outputs TensorBoard files during ``tune.run``.
+Tune automatically outputs TensorBoard files during ``Tuner.fit()``.
 To visualize learning in tensorboard, install tensorboardX:

 .. code-block:: bash
@@ -5,7 +5,7 @@ Ray Tune periodically checkpoints the experiment state so that it can be restart
 The checkpointing period is dynamically adjusted so that at least 95% of the time is used for handling
 training results and scheduling.

-If you send a SIGINT signal to the process running ``tune.run()`` (which is
+If you send a SIGINT signal to the process running ``Tuner.fit()`` (which is
 usually what happens when you press Ctrl+C in the console), Ray Tune shuts
 down training gracefully and saves a final experiment-level checkpoint.
@@ -17,24 +17,22 @@ How to resume a Tune run?
 -------------------------

 If you've stopped a run and want to resume from where you left off,
-you can then call ``tune.run()`` with ``resume=True`` like this:
+you can then call ``Tuner.restore()`` like this:

 .. code-block:: python
-:emphasize-lines: 5
+:emphasize-lines: 4

-tune.run(
-train,
-# other configuration
-name="my_experiment",
-resume=True
+tuner = Tuner.restore(
+path="~/ray_results/my_experiment"
 )
+tuner.fit()

-You will have to pass a ``name`` if you are using ``resume=True`` so that Ray Tune can detect the experiment
-folder (which is usually stored at e.g. ``~/ray_results/my_experiment``).
-If you forgot to pass a name in the first call, you can still pass the name when you resume the run.
-Please note that in this case it is likely that your experiment name has a date suffix, so if you
-ran ``tune.run(my_trainable)``, the ``name`` might look like something like this:
-``my_trainable_2021-01-29_10-16-44``.
+There are a few options for restoring an experiment:
+"resume_unfinished", "resume_errored" and "restart_errored". See ``Tuner.restore()`` for more details.
+
+``path`` here is determined by the ``air.RunConfig.name`` you supplied to your ``Tuner()``.
+If you didn't supply name to ``Tuner``, it is likely that your ``path`` looks something like:
+"~/ray_results/my_trainable_2021-01-29_10-16-44".

 You can see which name you need to pass by taking a look at the results table
 of your original tuning run:
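A slightly fuller sketch of the restore flow described above; the path and the ``resume_errored`` flag are illustrative assumptions:

    # Hedged sketch: restore a previous Tuner session and continue it.
    # The path and the resume_errored flag are illustrative assumptions.
    from ray.tune import Tuner

    tuner = Tuner.restore(
        path="~/ray_results/my_experiment",
        resume_errored=True,  # also re-run trials that previously errored
    )
    results = tuner.fit()
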
@@ -49,17 +47,13 @@ of your original tuning run:
 Result logdir: /Users/ray/ray_results/my_trainable_2021-01-29_10-16-44
 Number of trials: 1/1 (1 RUNNING)

-Another useful option to know about is ``resume="AUTO"``, which will attempt to resume the experiment if possible,
-and otherwise will start a new experiment.
-For more details and other options for ``resume``, see the :ref:`Tune run API documentation <tune-run-ref>`.
-
 .. _tune-stopping-ref:

 How to stop Tune runs programmatically?
 ---------------------------------------

 We've just covered the case in which you manually interrupt a Tune run.
-But you can also control when trials are stopped early by passing the ``stop`` argument to ``tune.run``.
+But you can also control when trials are stopped early by passing the ``stop`` argument to ``Tuner``.
 This argument takes a dictionary, a function, or a :class:`Stopper <ray.tune.stopper.Stopper>` class as an argument.

 If a dictionary is passed in, the keys may be any field in the return result of ``session.report`` in the
@@ -75,10 +69,10 @@ These metrics are assumed to be **increasing**.
 .. code-block:: python

 # training_iteration is an auto-filled metric by Tune.
-tune.run(
+tune.Tuner(
 my_trainable,
-stop={"training_iteration": 10, "mean_accuracy": 0.98}
-)
+run_config=air.RunConfig(stop={"training_iteration": 10, "mean_accuracy": 0.98})
+).fit()

 Stopping with a function
 ~~~~~~~~~~~~~~~~~~~~~~~~
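Since the dictionary keys must match fields reported by the trainable via ``session.report``, here is a hedged sketch of a function trainable that produces the ``mean_accuracy`` metric used above (names and values are illustrative):

    # Hedged sketch: a function trainable reporting the metric that the
    # dictionary stop criterion above refers to. Values are illustrative.
    from ray import air, tune
    from ray.air import session


    def my_trainable(config):
        for i in range(20):
            # "mean_accuracy" becomes a key that stop criteria can reference.
            session.report({"mean_accuracy": i / 20})


    tune.Tuner(
        my_trainable,
        run_config=air.RunConfig(stop={"training_iteration": 10, "mean_accuracy": 0.98}),
    ).fit()
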
@@ -92,7 +86,7 @@ If a function is passed in, it must take ``(trial_id, result)`` as arguments and
 def stopper(trial_id, result):
 return result["mean_accuracy"] / result["training_iteration"] > 5

-tune.run(my_trainable, stop=stopper)
+tune.Tuner(my_trainable, run_config=air.RunConfig(stop=stopper)).fit()

 Stopping with a class
 ~~~~~~~~~~~~~~~~~~~~~
@@ -117,7 +111,7 @@ Finally, you can implement the :class:`Stopper <ray.tune.stopper.Stopper>` abstr
 return self.should_stop

 stopper = CustomStopper()
-tune.run(my_trainable, stop=stopper)
+tune.Tuner(my_trainable, run_config=air.RunConfig(stop=stopper)).fit()


 Note that in the above example the currently running trials will not stop immediately but will do so
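The hunk above shows only the tail of ``CustomStopper``; a hedged sketch of a complete ``Stopper`` implementation follows (the stopping condition is an illustrative assumption, and ``my_trainable`` is assumed to be defined elsewhere):

    # Hedged sketch of a custom Stopper; the accuracy threshold is illustrative.
    # `my_trainable` is assumed to be defined elsewhere.
    from ray import air, tune
    from ray.tune.stopper import Stopper


    class CustomStopper(Stopper):
        def __init__(self):
            self.should_stop = False

        def __call__(self, trial_id, result):
            # Flip the experiment-wide flag once any trial is good enough.
            if not self.should_stop and result["mean_accuracy"] >= 0.96:
                self.should_stop = True
            return self.should_stop

        def stop_all(self):
            """Return True to stop the entire experiment."""
            return self.should_stop


    stopper = CustomStopper()
    tune.Tuner(my_trainable, run_config=air.RunConfig(stop=stopper)).fit()
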
@@ -129,11 +123,11 @@ Ray Tune comes with a set of out-of-the-box stopper classes. See the :ref:`Stopp
 Stopping after the first failure
 --------------------------------

-By default, ``tune.run`` will continue executing until all trials have terminated or errored.
+By default, ``Tuner.fit()`` will continue executing until all trials have terminated or errored.
 To stop the entire Tune run as soon as **any** trial errors:

 .. code-block:: python

-tune.run(trainable, fail_fast=True)
+tune.Tuner(trainable, run_config=air.RunConfig(failure_config=air.FailureConfig(fail_fast=True))).fit()

 This is useful when you are trying to setup a large hyperparameter experiment.
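Written out with its imports, the fail-fast call above would look roughly as follows (``trainable`` is assumed to be defined as in the surrounding docs):

    # Hedged sketch: stop the whole run as soon as any trial errors.
    # `trainable` is assumed to be defined elsewhere.
    from ray import air, tune

    tune.Tuner(
        trainable,
        run_config=air.RunConfig(failure_config=air.FailureConfig(fail_fast=True)),
    ).fit()
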