[docs] Switch docs to use rllib train instead of train.py

Eric Liang 2018-12-04 17:36:06 -08:00 committed by GitHub
parent 9d0bd50e78
commit 93a9d32288
6 changed files with 23 additions and 22 deletions


@@ -41,12 +41,12 @@ Example Use
Ray comes with libraries that accelerate deep learning and reinforcement learning development:
-- `Ray Tune`_: Hyperparameter Optimization Framework
-- `Ray RLlib`_: Scalable Reinforcement Learning
+- `Tune`_: Hyperparameter Optimization Framework
+- `RLlib`_: Scalable Reinforcement Learning
- `Distributed Training <http://ray.readthedocs.io/en/latest/distributed_sgd.html>`__
-.. _`Ray Tune`: http://ray.readthedocs.io/en/latest/tune.html
-.. _`Ray RLlib`: http://ray.readthedocs.io/en/latest/rllib.html
+.. _`Tune`: http://ray.readthedocs.io/en/latest/tune.html
+.. _`RLlib`: http://ray.readthedocs.io/en/latest/rllib.html
Installation
------------
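
Everything in this change assumes a working Ray installation, which for these docs is a one-liner; a minimal sketch (the bare package name only, extras and version pins are assumptions left out):

.. code-block:: bash

    # Installs Ray, which also provides the `rllib` command used throughout these docs.
    pip install -U ray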


@@ -29,7 +29,7 @@ You can run the code with
.. code-block:: bash
-python/ray/rllib/train.py --env=Pong-ram-v4 --run=A3C --config='{"num_workers": N}'
+rllib train --env=Pong-ram-v4 --run=A3C --config='{"num_workers": N}'
Reinforcement Learning
----------------------


@@ -18,13 +18,13 @@ on the ``Humanoid-v1`` gym environment.
.. code-block:: bash
-python/ray/rllib/train.py --env=Humanoid-v1 --run=ES
+rllib train --env=Humanoid-v1 --run=ES
To train a policy on a cluster (e.g., using 900 workers), run the following.
.. code-block:: bash
-python ray/python/ray/rllib/train.py \
+rllib train \
--env=Humanoid-v1 \
--run=ES \
--redis-address=<redis-address> \
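
For context, the ``<redis-address>`` placeholder is the address printed when the Ray cluster is brought up. A minimal sketch of starting such a cluster by hand, assuming the ``ray start`` options of this Ray release:

.. code-block:: bash

    # On the head node: start Ray and note the redis address it prints.
    ray start --head

    # On each worker node: join the cluster using that address.
    ray start --redis-address=<redis-address>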


@@ -21,7 +21,7 @@ Then you can run the example as follows.
.. code-block:: bash
-python/ray/rllib/train.py --env=Pong-ram-v4 --run=PPO
+rllib train --env=Pong-ram-v4 --run=PPO
This will train an agent on the ``Pong-ram-v4`` Atari environment. You can also
try passing in the ``Pong-v0`` environment or the ``CartPole-v0`` environment.
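
For instance, the ``CartPole-v0`` variant mentioned above only swaps the environment name:

.. code-block:: bash

    # Same algorithm, lighter-weight environment for quick experimentation.
    rllib train --env=CartPole-v0 --run=PPO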


@@ -10,11 +10,11 @@ be trained, checkpointed, or an action computed.
.. image:: rllib-api.svg
-You can train a simple DQN agent with the following command
+You can train a simple DQN agent with the following command:
.. code-block:: bash
-python ray/python/ray/rllib/train.py --run DQN --env CartPole-v0
+rllib train --run DQN --env CartPole-v0
By default, the results will be logged to a subdirectory of ``~/ray_results``.
This subdirectory will contain a file ``params.json`` which contains the
@@ -26,10 +26,12 @@ training process with TensorBoard by running
tensorboard --logdir=~/ray_results
-The ``train.py`` script has a number of options you can show by running
+The ``rllib train`` command (same as the ``train.py`` script in the repo) has a number of options you can show by running:
.. code-block:: bash
+rllib train --help
+-or-
python ray/python/ray/rllib/train.py --help
The most important options are for choosing the environment
@@ -42,16 +44,16 @@ Evaluating Trained Agents
In order to save checkpoints from which to evaluate agents,
set ``--checkpoint-freq`` (number of training iterations between checkpoints)
-when running ``train.py``.
+when running ``rllib train``.
An example of evaluating a previously trained DQN agent is as follows:
.. code-block:: bash
-python ray/python/ray/rllib/rollout.py \
-    ~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1 \
-    --run DQN --env CartPole-v0 --steps 10000
+rllib rollout \
+    ~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1 \
+    --run DQN --env CartPole-v0 --steps 10000
The ``rollout.py`` helper script reconstructs a DQN agent from the checkpoint
located at ``~/ray_results/default/DQN_CartPole-v0_0upjmdgr0/checkpoint_1/checkpoint-1``
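
The checkpoint being evaluated has to be produced first; a minimal sketch of a training run that writes one, using the ``--checkpoint-freq`` flag described above (the value 10 is arbitrary):

.. code-block:: bash

    # Save a checkpoint under ~/ray_results/ every 10 training iterations.
    rllib train --run DQN --env CartPole-v0 --checkpoint-freq 10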
@@ -70,8 +72,7 @@ In an example below, we train A2C by specifying 8 workers through the config flag.
.. code-block:: bash
-python ray/python/ray/rllib/train.py --env=PongDeterministic-v4 \
-    --run=A2C --config '{"num_workers": 8}'
+rllib train --env=PongDeterministic-v4 --run=A2C --config '{"num_workers": 8}'
Specifying Resources
~~~~~~~~~~~~~~~~~~~~
@@ -98,11 +99,11 @@ Some good hyperparameters and settings are available in
(some of them are tuned to run on GPUs). If you find better settings or tune
an algorithm on a different domain, consider submitting a Pull Request!
-You can run these with the ``train.py`` script as follows:
+You can run these with the ``rllib train`` command as follows:
.. code-block:: bash
-python ray/python/ray/rllib/train.py -f /path/to/tuned/example.yaml
+rllib train -f /path/to/tuned/example.yaml
Python API
----------
@@ -356,7 +357,7 @@ The ``"monitor": true`` config can be used to save Gym episode videos to the res
.. code-block:: bash
-python ray/python/ray/rllib/train.py --env=PongDeterministic-v4 \
+rllib train --env=PongDeterministic-v4 \
--run=A2C --config '{"num_workers": 2, "monitor": true}'
# videos will be saved in the ~/ray_results/<experiment> dir, for example
@@ -372,7 +373,7 @@ You can control the agent log level via the ``"log_level"`` flag. Valid values a
.. code-block:: bash
-python ray/python/ray/rllib/train.py --env=PongDeterministic-v4 \
+rllib train --env=PongDeterministic-v4 \
--run=A2C --config '{"num_workers": 2, "log_level": "DEBUG"}'
Stack Traces


@@ -14,7 +14,7 @@ Example usage for training:
rllib train --run DQN --env CartPole-v0
Example usage for rollout:
-rllib rollout /tmp/ray/checkpoint_dir/checkpoint-0 --run DQN
+rllib rollout /trial_dir/checkpoint_1/checkpoint-1 --run DQN
"""