.. include:: /_includes/rllib/announcement.rst

.. include:: /_includes/rllib/we_are_hiring.rst

.. _rllib-index:

RLlib: Industry-Grade Reinforcement Learning
============================================

.. image:: images/rllib-logo.png
    :align: center

**RLlib** is an open-source library for reinforcement learning (RL), offering support for
production-level, highly distributed RL workloads while maintaining
unified and simple APIs for a large variety of industry applications.
Whether you would like to train your agents in a **multi-agent** setup,
purely from **offline** (historic) datasets, or using **externally
connected simulators**, RLlib offers a simple solution for each of your
decision-making needs.


If you either have your problem coded (in python) as an
`RL environment <rllib-env.html#configuring-environments>`_
or own lots of pre-recorded, historic behavioral data to learn from, you will be
up and running in only a few days.

RLlib is already used in production by industry leaders in many different verticals, such as
`climate control <https://www.anyscale.com/events/2021/06/23/applying-ray-and-rllib-to-real-life-industrial-use-cases>`_,
`industrial control <https://www.anyscale.com/events/2021/06/23/applying-ray-and-rllib-to-real-life-industrial-use-cases>`_,
`manufacturing and logistics <https://www.anyscale.com/events/2022/03/29/alphadow-leveraging-rays-ecosystem-to-train-and-deploy-an-rl-industrial>`_,
`finance <https://www.anyscale.com/events/2021/06/22/a-24x-speedup-for-reinforcement-learning-with-rllib-+-ray>`_,
`gaming <https://www.anyscale.com/events/2021/06/22/using-reinforcement-learning-to-optimize-iap-offer-recommendations-in-mobile-games>`_,
`automobile <https://www.anyscale.com/events/2021/06/23/using-rllib-in-an-enterprise-scale-reinforcement-learning-solution>`_,
`robotics <https://www.anyscale.com/events/2021/06/23/introducing-amazon-sagemaker-kubeflow-reinforcement-learning-pipelines-for>`_,
`boat design <https://www.youtube.com/watch?v=cLCK13ryTpw>`_,
and many others.


RLlib in 60 seconds
-------------------

.. figure:: images/rllib-index-header.svg

It only takes a few steps to get your first RLlib workload up and running on your laptop:

**TensorFlow or PyTorch**:
RLlib does not automatically install a deep-learning framework, but supports
TensorFlow (both 1.x with static-graph and 2.x with eager mode) as well as
PyTorch. Depending on your needs, make sure to install either TensorFlow or
PyTorch (or both, as shown below):

.. code-block:: bash

    $ conda create -n rllib python=3.8
    $ conda activate rllib
    $ pip install "ray[rllib]" tensorflow torch


Note that for installation on computers running Apple Silicon (such as M1), please follow the instructions
`here <https://docs.ray.io/en/latest/installation.html#m1-mac-apple-silicon-support>`_.

To be able to run our Atari examples, you should also install:

.. code-block:: bash

    $ pip install "gym[atari]" "gym[accept-rom-license]" atari_py

After these quick pip installs, you can start coding against RLlib.

Here is an example of running a PPO Trainer on the `Taxi domain <https://www.gymlibrary.ml/environments/toy_text/taxi/>`_
for a few training iterations, then performing a single evaluation loop
(with rendering enabled):

.. literalinclude:: ../../../rllib/examples/documentation/rllib_in_60s.py
    :language: python
    :start-after: __rllib-in-60s-begin__
    :end-before: __rllib-in-60s-end__

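
In essence, the example does something like the following. This is only a minimal sketch, assuming a
Ray/RLlib version in which ``PPOTrainer`` is importable from ``ray.rllib.agents.ppo``; the config keys
shown are illustrative, not exhaustive:

.. code-block:: python

    # Minimal sketch (not the verbatim example script): train PPO on Taxi-v3
    # for a few iterations, then run one evaluation loop with rendering.
    from ray.rllib.agents.ppo import PPOTrainer

    config = {
        # RLlib understands openAI gym registered env name strings.
        "env": "Taxi-v3",
        # Parallel rollout workers collecting samples.
        "num_workers": 2,
        # Deep-learning framework to use ("tf", "tf2", or "torch").
        "framework": "torch",
        # Set up one evaluation worker that renders its episodes.
        "evaluation_num_workers": 1,
        "evaluation_config": {"render_env": True},
    }

    trainer = PPOTrainer(config=config)

    # Run a few training iterations and print the collected stats.
    for _ in range(3):
        print(trainer.train())

    # Run a single evaluation loop (rendering enabled via the config above).
    trainer.evaluate()
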

`See here for a simple example on how to write an action inference loop after training. <https://github.com/ray-project/ray/blob/master/rllib/examples/inference_and_serving/policy_inference_after_training.py>`_

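
As a quick illustration, a minimal inference loop (assuming the ``trainer`` object from the sketch above
and the pre-0.26 ``gym`` API, where ``reset()`` returns only the observation) could look like this:

.. code-block:: python

    import gym

    # Roll out one episode with the trained policy (exploration disabled).
    env = gym.make("Taxi-v3")
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = trainer.compute_single_action(obs, explore=False)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    print(f"Episode reward after training: {total_reward}")
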
Feature Overview
----------------

You can read about:

- `RLlib Key Concepts <core-concepts.html>`_
- `RLlib Algorithms <rllib-algorithms.html>`_
- `RLlib Environments <rllib-env.html>`_

The following is a summary of RLlib's most striking features.
Click on the images below to see an example script for each of the listed features:

.. include:: feature_overview.rst


If you want to get a quick preview of which **algorithms** and **environments** RLlib supports,
click on the dropdowns below:

.. dropdown:: **RLlib Algorithms**
    :animate: fade-in-slide-down

    * High-throughput architectures

      - |pytorch| |tensorflow| :ref:`Distributed Prioritized Experience Replay (Ape-X) <apex>`
      - |pytorch| |tensorflow| :ref:`Importance Weighted Actor-Learner Architecture (IMPALA) <impala>`
      - |pytorch| |tensorflow| :ref:`Asynchronous Proximal Policy Optimization (APPO) <appo>`
      - |pytorch| :ref:`Decentralized Distributed Proximal Policy Optimization (DD-PPO) <ddppo>`

    * Gradient-based

      - |pytorch| |tensorflow| :ref:`Advantage Actor-Critic (A2C, A3C) <a3c>`
      - |pytorch| |tensorflow| :ref:`Deep Deterministic Policy Gradients (DDPG, TD3) <ddpg>`
      - |pytorch| |tensorflow| :ref:`Deep Q Networks (DQN, Rainbow, Parametric DQN) <dqn>`
      - |pytorch| |tensorflow| :ref:`Policy Gradients <pg>`
      - |pytorch| |tensorflow| :ref:`Proximal Policy Optimization (PPO) <ppo>`
      - |pytorch| |tensorflow| :ref:`Soft Actor Critic (SAC) <sac>`
      - |pytorch| :ref:`Slate Q-Learning (SlateQ) <slateq>`

    * Derivative-free

      - |pytorch| |tensorflow| :ref:`Augmented Random Search (ARS) <ars>`
      - |pytorch| |tensorflow| :ref:`Evolution Strategies <es>`

    * Model-based / Meta-learning / Offline

      - |pytorch| :ref:`Single-Player AlphaZero (contrib/AlphaZero) <alphazero>`
      - |pytorch| |tensorflow| :ref:`Model-Agnostic Meta-Learning (MAML) <maml>`
      - |pytorch| :ref:`Model-Based Meta-Policy-Optimization (MBMPO) <mbmpo>`
      - |pytorch| :ref:`Dreamer (DREAMER) <dreamer>`
      - |pytorch| :ref:`Conservative Q-Learning (CQL) <cql>`

    * Multi-agent

      - |pytorch| :ref:`QMIX Monotonic Value Factorisation (QMIX, VDN, IQN) <qmix>`
      - |tensorflow| :ref:`Multi-Agent Deep Deterministic Policy Gradient (contrib/MADDPG) <maddpg>`

    * Offline

      - |pytorch| |tensorflow| :ref:`Advantage Re-Weighted Imitation Learning (MARWIL) <marwil>`

    * Contextual bandits

      - |pytorch| :ref:`Linear Upper Confidence Bound (contrib/LinUCB) <lin-ucb>`
      - |pytorch| :ref:`Linear Thompson Sampling (contrib/LinTS) <lints>`

    * Exploration-based plug-ins (can be combined with any algo)

      - |pytorch| :ref:`Curiosity (ICM: Intrinsic Curiosity Module) <curiosity>`

.. dropdown:: **RLlib Environments**
    :animate: fade-in-slide-down

    * `RLlib Environments Overview <rllib-env.html>`__
    * `OpenAI Gym <rllib-env.html#openai-gym>`__
    * `Vectorized <rllib-env.html#vectorized>`__
    * `Multi-Agent and Hierarchical <rllib-env.html#multi-agent-and-hierarchical>`__
    * `External Agents and Applications <rllib-env.html#external-agents-and-applications>`__

      - `External Application Clients <rllib-env.html#external-application-clients>`__

    * `Advanced Integrations <rllib-env.html#advanced-integrations>`__

Customizing RLlib
-----------------

RLlib provides simple APIs to customize all aspects of your training and experimental workflows.
For example, you may code your own `environments <rllib-env.html#configuring-environments>`__
in Python using OpenAI's Gym or DeepMind's OpenSpiel, provide custom
`TensorFlow/Keras- <rllib-models.html#tensorflow-models>`__ or
`Torch models <rllib-models.html#torch-models>`_, write your own
`policy- and loss definitions <rllib-concepts.html#policies>`__, or define
custom `exploratory behavior <rllib-training.html#exploration-api>`_.
A minimal custom-environment sketch follows below.
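
For instance, a custom environment can be as small as the following sketch. The trivial one-step task
below is made up purely for illustration, and ``PPOTrainer`` is assumed importable from
``ray.rllib.agents.ppo``:

.. code-block:: python

    import gym
    import numpy as np
    from gym.spaces import Box, Discrete

    from ray.rllib.agents.ppo import PPOTrainer


    class MyCustomEnv(gym.Env):
        """One-step episodes: action 1 is rewarded, action 0 is not."""

        def __init__(self, env_config=None):
            self.action_space = Discrete(2)
            self.observation_space = Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

        def reset(self):
            return np.zeros(1, dtype=np.float32)

        def step(self, action):
            reward = 1.0 if action == 1 else 0.0
            return np.zeros(1, dtype=np.float32), reward, True, {}


    # Pass the class directly; RLlib instantiates it on each rollout worker.
    trainer = PPOTrainer(config={"env": MyCustomEnv, "framework": "torch"})
    print(trainer.train())
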
By mapping one or more agents in your environments to (one or more) policies, multi-agent
RL (MARL) becomes an easy-to-use low-level primitive for our users (see the multi-agent sketch below).
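
A minimal multi-agent sketch, assuming the ``MultiAgentCartPole`` example environment that ships with
RLlib (its agent IDs are the integers ``0`` and ``1``):

.. code-block:: python

    from ray.rllib.agents.ppo import PPOTrainer
    from ray.rllib.examples.env.multi_agent import MultiAgentCartPole

    config = {
        "env": MultiAgentCartPole,
        "env_config": {"num_agents": 2},
        "multiagent": {
            # Two independently trained policies.
            "policies": {"policy_0", "policy_1"},
            # Map each agent ID to one of the policies above.
            "policy_mapping_fn": lambda agent_id, episode, worker, **kwargs: (
                "policy_0" if agent_id == 0 else "policy_1"
            ),
        },
        "framework": "torch",
    }

    trainer = PPOTrainer(config=config)
    print(trainer.train())
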

.. figure:: images/rllib-stack.svg
    :align: left
    :width: 650

    **RLlib's API stack:** Built on top of Ray, RLlib offers off-the-shelf, highly distributed
    algorithms, policies, loss functions, and default models (including the option to
    auto-wrap a neural network with an LSTM or an attention net). Furthermore, our library
    comes with a built-in Server/Client setup, allowing you to connect
    hundreds of external simulators (clients) over the network to an RLlib server process,
    which provides learning functionality and serves action queries. User customizations
    are realized by sub-classing the existing abstractions and, by overriding certain
    methods in those sub-classes, defining custom behavior.

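To give a flavor of the Server/Client setup mentioned in the figure above, here is a minimal client-side
sketch. It assumes an RLlib policy server is already listening on ``localhost:9900`` and uses
``my_simulator`` as a hypothetical stand-in for your own external application (RLlib ships complete
server and client example scripts):

.. code-block:: python

    from ray.rllib.env.policy_client import PolicyClient

    # Connect this external application to a running RLlib policy server.
    client = PolicyClient("http://localhost:9900", inference_mode="remote")

    # `my_simulator` is a placeholder for your own (external) simulator.
    obs = my_simulator.reset()
    episode_id = client.start_episode()
    done = False
    while not done:
        # Ask the server for an action, step the simulator, report the reward.
        action = client.get_action(episode_id, obs)
        obs, reward, done, info = my_simulator.step(action)
        client.log_returns(episode_id, reward)
    client.end_episode(episode_id, obs)
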
.. |tensorflow| image:: images/tensorflow.png
    :class: inline-figure
    :width: 16

.. |pytorch| image:: images/pytorch.png
    :class: inline-figure
    :width: 16