[docs] Switch docs to use rllib train instead of train.py
commit 93a9d32288
parent 9d0bd50e78
6 changed files with 23 additions and 22 deletions
@@ -41,12 +41,12 @@ Example Use

 Ray comes with libraries that accelerate deep learning and reinforcement learning development:

-- `Ray Tune`_: Hyperparameter Optimization Framework
-- `Ray RLlib`_: Scalable Reinforcement Learning
+- `Tune`_: Hyperparameter Optimization Framework
+- `RLlib`_: Scalable Reinforcement Learning
 - `Distributed Training <http://ray.readthedocs.io/en/latest/distributed_sgd.html>`__

-.. _`Ray Tune`: http://ray.readthedocs.io/en/latest/tune.html
-.. _`Ray RLlib`: http://ray.readthedocs.io/en/latest/rllib.html
+.. _`Tune`: http://ray.readthedocs.io/en/latest/tune.html
+.. _`RLlib`: http://ray.readthedocs.io/en/latest/rllib.html

 Installation
 ------------
@@ -29,7 +29,7 @@ You can run the code with

 .. code-block:: bash

-    python/ray/rllib/train.py --env=Pong-ram-v4 --run=A3C --config='{"num_workers": N}'
+    rllib train --env=Pong-ram-v4 --run=A3C --config='{"num_workers": N}'

 Reinforcement Learning
 ----------------------
@@ -18,13 +18,13 @@ on the ``Humanoid-v1`` gym environment.

 .. code-block:: bash

-    python/ray/rllib/train.py --env=Humanoid-v1 --run=ES
+    rllib train --env=Humanoid-v1 --run=ES

 To train a policy on a cluster (e.g., using 900 workers), run the following.

 .. code-block:: bash

-    python ray/python/ray/rllib/train.py \
+    rllib train \
         --env=Humanoid-v1 \
         --run=ES \
         --redis-address=<redis-address> \
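For orientation, the ``--redis-address`` flag is what attaches the driver to an existing cluster; from Python the same thing is a single ``ray.init`` call. A minimal sketch, assuming the ``redis_address`` keyword of this Ray release and keeping the ``<redis-address>`` placeholder from the example above:

.. code-block:: python

    import ray

    # Attach to an already-running Ray cluster instead of starting a local
    # one; this mirrors passing --redis-address to ``rllib train``.
    ray.init(redis_address="<redis-address>")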
@@ -21,7 +21,7 @@ Then you can run the example as follows.

 .. code-block:: bash

-    python/ray/rllib/train.py --env=Pong-ram-v4 --run=PPO
+    rllib train --env=Pong-ram-v4 --run=PPO

 This will train an agent on the ``Pong-ram-v4`` Atari environment. You can also
 try passing in the ``Pong-v0`` environment or the ``CartPole-v0`` environment.
@@ -10,11 +10,11 @@ be trained, checkpointed, or an action computed.

 .. image:: rllib-api.svg

-You can train a simple DQN agent with the following command
+You can train a simple DQN agent with the following command:

 .. code-block:: bash

-    python ray/python/ray/rllib/train.py --run DQN --env CartPole-v0
+    rllib train --run DQN --env CartPole-v0

 By default, the results will be logged to a subdirectory of ``~/ray_results``.
 This subdirectory will contain a file ``params.json`` which contains the
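The same DQN run can also be driven programmatically, which is what the ``Python API`` section referenced further down describes. A rough sketch only, assuming the agent class path and the ``train``/``save`` methods of this RLlib release:

.. code-block:: python

    import ray
    from ray.rllib.agents.dqn import DQNAgent  # assumed import path for this release

    ray.init()
    agent = DQNAgent(env="CartPole-v0")

    # Each train() call runs one training iteration and returns a result dict;
    # results are also written under ~/ray_results, as with ``rllib train``.
    for i in range(10):
        result = agent.train()
        print(i, result["episode_reward_mean"])

    # Write a checkpoint that can later be passed to ``rllib rollout``.
    print("checkpoint saved at", agent.save())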
@@ -26,10 +26,12 @@ training process with TensorBoard by running

     tensorboard --logdir=~/ray_results

-The ``train.py`` script has a number of options you can show by running
+The ``rllib train`` command (same as the ``train.py`` script in the repo) has a number of options you can show by running:

 .. code-block:: bash

+    rllib train --help
+    -or-
     python ray/python/ray/rllib/train.py --help

 The most important options are for choosing the environment
@@ -42,16 +44,16 @@ Evaluating Trained Agents

 In order to save checkpoints from which to evaluate agents,
 set ``--checkpoint-freq`` (number of training iterations between checkpoints)
-when running ``train.py``.
+when running ``rllib train``.


 An example of evaluating a previously trained DQN agent is as follows:

 .. code-block:: bash

-    python ray/python/ray/rllib/rollout.py \
-        ~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1 \
-        --run DQN --env CartPole-v0 --steps 10000
+    rllib rollout \
+        ~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1 \
+        --run DQN --env CartPole-v0 --steps 10000

 The ``rollout.py`` helper script reconstructs a DQN agent from the checkpoint
 located at ``~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1``
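What the rollout helper does can be approximated directly from Python. A hedged sketch, assuming the same agent class path as above and the Gym step/reset API of this era (the checkpoint path is the one used in the example):

.. code-block:: python

    import os

    import gym
    import ray
    from ray.rllib.agents.dqn import DQNAgent  # assumed import path for this release

    ray.init()
    agent = DQNAgent(env="CartPole-v0")
    # Restore the weights written by a previous training run.
    agent.restore(os.path.expanduser(
        "~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1"))

    # Roll out one episode with the restored policy.
    env = gym.make("CartPole-v0")
    obs, done, total_reward = env.reset(), False, 0.0
    while not done:
        action = agent.compute_action(obs)
        obs, reward, done, _ = env.step(action)
        total_reward += reward
    print("episode reward:", total_reward)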
@@ -70,8 +72,7 @@ In an example below, we train A2C by specifying 8 workers through the config flag.

 .. code-block:: bash

-    python ray/python/ray/rllib/train.py --env=PongDeterministic-v4 \
-        --run=A2C --config '{"num_workers": 8}'
+    rllib train --env=PongDeterministic-v4 --run=A2C --config '{"num_workers": 8}'

 Specifying Resources
 ~~~~~~~~~~~~~~~~~~~~
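The JSON passed with ``--config`` is simply an override of the algorithm's default configuration dict, so the same run can be expressed in Python. A small sketch; the ``A2CAgent`` import path is an assumption for this RLlib version:

.. code-block:: python

    import ray
    from ray.rllib.agents.a3c import A2CAgent  # assumed import path for this release

    ray.init()
    # The same override as --config '{"num_workers": 8}' on the command line.
    agent = A2CAgent(env="PongDeterministic-v4", config={"num_workers": 8})
    agent.train()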
@@ -98,11 +99,11 @@ Some good hyperparameters and settings are available in
 (some of them are tuned to run on GPUs). If you find better settings or tune
 an algorithm on a different domain, consider submitting a Pull Request!

-You can run these with the ``train.py`` script as follows:
+You can run these with the ``rllib train`` command as follows:

 .. code-block:: bash

-    python ray/python/ray/rllib/train.py -f /path/to/tuned/example.yaml
+    rllib train -f /path/to/tuned/example.yaml

 Python API
 ----------
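The ``-f`` option essentially hands the parsed YAML experiment spec to Tune. A rough sketch of that flow, assuming ``ray.tune.run_experiments`` and a tuned-example file laid out as a mapping from experiment name to ``run``/``env``/``config`` entries (the file path is the placeholder from the example above):

.. code-block:: python

    import yaml

    import ray
    from ray.tune import run_experiments

    ray.init()

    # A tuned example file maps experiment names to run/env/config entries,
    # mirroring the ``rllib train`` command-line flags.
    with open("/path/to/tuned/example.yaml") as f:
        experiments = yaml.safe_load(f)

    run_experiments(experiments)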
@@ -356,7 +357,7 @@ The ``"monitor": true`` config can be used to save Gym episode videos to the res

 .. code-block:: bash

-    python ray/python/ray/rllib/train.py --env=PongDeterministic-v4 \
+    rllib train --env=PongDeterministic-v4 \
         --run=A2C --config '{"num_workers": 2, "monitor": true}'

     # videos will be saved in the ~/ray_results/<experiment> dir, for example
@@ -372,7 +373,7 @@ You can control the agent log level via the ``"log_level"`` flag. Valid values a

 .. code-block:: bash

-    python ray/python/ray/rllib/train.py --env=PongDeterministic-v4 \
+    rllib train --env=PongDeterministic-v4 \
         --run=A2C --config '{"num_workers": 2, "log_level": "DEBUG"}'

 Stack Traces
@@ -14,7 +14,7 @@ Example usage for training:
     rllib train --run DQN --env CartPole-v0

 Example usage for rollout:
-    rllib rollout /tmp/ray/checkpoint_dir/checkpoint-0 --run DQN
+    rllib rollout /trial_dir/checkpoint_1/checkpoint-1 --run DQN
 """
