ray/doc/source/example-policy-gradient.rst

Policy Gradient Methods
=======================

This code shows how to do reinforcement learning with policy gradient methods.
View the `code for this example`_.

To run this example, you will need to install `TensorFlow with GPU support`_ (at
least version ``1.0.0``) and a few other dependencies.

.. code-block:: bash

  pip install gym[atari]
  pip install tensorflow

Then install the package as follows.

.. code-block:: bash

  cd ray/examples/policy_gradient/
  python setup.py install

Then you can run the example as follows.

.. code-block:: bash

  python ray/examples/policy_gradient/examples/example.py --environment=Pong-ram-v3

This will train an agent on the ``Pong-ram-v3`` Atari environment. You can also
try passing in the ``Pong-v0`` environment or the ``CartPole-v0`` environment.
If you wish to use a different environment, you will need to change a few lines
in ``example.py``.

Current and historical training progress can be monitored by pointing
TensorBoard to the log output directory as follows.

.. code-block:: bash

  tensorboard --logdir=/tmp/ray

Many of the TensorBoard metrics are also printed to the console, but you might
find it easier to visualize and compare between runs using the TensorBoard UI.

.. _`TensorFlow with GPU support`: https://www.tensorflow.org/install/
.. _`code for this example`: https://github.com/ray-project/ray/tree/master/examples/policy_gradient
Add policy gradient example. (#344) * add policy gradient example * fix typos * Minor changes plus some documentation. * Minor fixes. 2017-03-07 23:42:44 -08:00			`Policy Gradient Methods`
			`=======================`

			`This code shows how to do reinforcement learning with policy gradient methods.`
			View the `code for this example`_.

			To run this example, you will need to install `TensorFlow with GPU support`_ (at
			least version ``1.0.0``) and a few other dependencies.

			`.. code-block:: bash`

			`pip install gym[atari]`
			`pip install tensorflow`

			`Then install the package as follows.`

			`.. code-block:: bash`

			`cd ray/examples/policy_gradient/`
			`python setup.py install`

			`Then you can run the example as follows.`

			`.. code-block:: bash`

Atari on pixels (#364) * pong on pixels working (not cleaned up) * make training compatible with all atari games * cartpole runs * Update documentation and usage for policy gradients. 2017-03-14 13:31:29 -07:00			`python ray/examples/policy_gradient/examples/example.py --environment=Pong-ram-v3`
Add policy gradient example. (#344) * add policy gradient example * fix typos * Minor changes plus some documentation. * Minor fixes. 2017-03-07 23:42:44 -08:00
Atari on pixels (#364) * pong on pixels working (not cleaned up) * make training compatible with all atari games * cartpole runs * Update documentation and usage for policy gradients. 2017-03-14 13:31:29 -07:00			This will train an agent on the ``Pong-ram-v3`` Atari environment. You can also
			try passing in the ``Pong-v0`` environment or the ``CartPole-v0`` environment.
			`If you wish to use a different environment, you will need to change a few lines`
			in ``example.py``.
Add policy gradient example. (#344) * add policy gradient example * fix typos * Minor changes plus some documentation. * Minor fixes. 2017-03-07 23:42:44 -08:00
Policy gradient example: record stats for tensorboard (#577) * add tf metrics * comments * fix network scopes * add doc * use format string * fix trace level * plot intermediate and final sgd stats * add back a global step 2017-05-21 14:51:24 -07:00			`Current and historical training progress can be monitored by pointing`
			`TensorBoard to the log output directory as follows.`

			`.. code-block:: bash`

			`tensorboard --logdir=/tmp/ray`

			`Many of the TensorBoard metrics are also printed to the console, but you might`
			`find it easier to visualize and compare between runs using the TensorBoard UI.`

Add policy gradient example. (#344) * add policy gradient example * fix typos * Minor changes plus some documentation. * Minor fixes. 2017-03-07 23:42:44 -08:00			.. _`TensorFlow with GPU support`: https://www.tensorflow.org/install/
			.. _`code for this example`: https://github.com/ray-project/ray/tree/master/examples/policy_gradient