.. _env-reference-docs:

Environments
============

Any environment type that you provide to RLlib (e.g. a user-defined `gym.Env <https://github.com/openai/gym>`_ class)
is converted internally into the :py:class:`~ray.rllib.env.base_env.BaseEnv` API, whose main methods are ``poll()`` and ``send_actions()``:

.. https://docs.google.com/drawings/d/1NtbVk-Mo89liTRx-sHu_7fqi3Kn7Hjdf3i6jIMbxGlY/edit

.. image:: ../images/env_classes_overview.svg

The :py:class:`~ray.rllib.env.base_env.BaseEnv` API allows RLlib to support:

1) Vectorization of sub-environments (i.e. individual `gym.Env <https://github.com/openai/gym>`_ instances, stacked to form a vector of envs) in order to batch the model forward passes that compute actions.
2) External simulators requiring async execution (e.g. envs that run on separate machines and independently request actions from a policy server).
3) Stepping through the individual sub-environments in parallel by first converting them into separate ``@ray.remote`` actors.
4) Multi-agent RL via dicts mapping agent IDs to observations, rewards, etc.
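
To make the first and fourth points more concrete, the following is a minimal, standalone sketch of a ``poll()`` / ``send_actions()`` style interface over a vector of sub-environments. ``ToySubEnv``, ``ToyPollingEnv`` and ``AGENT_ID`` are hypothetical names invented for this illustration; they are not RLlib classes, and the return shape is simplified. Refer to the API reference for the actual :py:class:`~ray.rllib.env.base_env.BaseEnv` signatures.

```python
import random

# Hypothetical agent ID used for single-agent sub-environments in this sketch.
AGENT_ID = "agent0"


class ToySubEnv:
    """Hypothetical single-agent sub-environment (not an RLlib or gym class)."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._t = 0

    def reset(self):
        self._t = 0
        return self._rng.random()

    def step(self, action):
        self._t += 1
        obs = self._rng.random()
        reward = 1.0 if action == 1 else 0.0
        done = self._t >= 3  # episodes last three steps
        return obs, reward, done, {}


class ToyPollingEnv:
    """Simplified stand-in for a poll()/send_actions() style interface."""

    def __init__(self, sub_envs):
        self.sub_envs = sub_envs
        # Results waiting to be polled: env_id -> (obs, reward, done, info).
        self._pending = {
            env_id: (env.reset(), 0.0, False, {})
            for env_id, env in enumerate(sub_envs)
        }

    def poll(self):
        """Return nested dicts: env_id -> agent_id -> value."""
        obs, rewards, dones, infos = {}, {}, {}, {}
        for env_id, (o, r, d, info) in self._pending.items():
            obs[env_id] = {AGENT_ID: o}
            rewards[env_id] = {AGENT_ID: r}
            dones[env_id] = {AGENT_ID: d}
            infos[env_id] = {AGENT_ID: info}
        self._pending = {}
        return obs, rewards, dones, infos

    def send_actions(self, action_dict):
        """Step each sub-environment named in action_dict."""
        for env_id, agent_actions in action_dict.items():
            env = self.sub_envs[env_id]
            o, r, d, info = env.step(agent_actions[AGENT_ID])
            if d:
                o = env.reset()  # start a new episode right away
            self._pending[env_id] = (o, r, d, info)


env = ToyPollingEnv([ToySubEnv(seed=i) for i in range(2)])
obs, rewards, dones, infos = env.poll()
# One batched "policy" call could compute all actions from `obs` here.
actions = {env_id: {AGENT_ID: 1} for env_id in obs}
env.send_actions(actions)
obs, rewards, dones, infos = env.poll()
```

Because every sub-environment reports through the same nested dicts, a caller can compute actions for all of them in one batch and hand them back in a single ``send_actions()`` call.
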

For example, if you provide a custom `gym.Env <https://github.com/openai/gym>`_ class to RLlib, auto-conversion to :py:class:`~ray.rllib.env.base_env.BaseEnv` proceeds as follows:

- User provides a `gym.Env <https://github.com/openai/gym>`_ -> :py:class:`~ray.rllib.env.vector_env._VectorizedGymEnv` (is-a :py:class:`~ray.rllib.env.vector_env.VectorEnv`) -> :py:class:`~ray.rllib.env.base_env.BaseEnv`

Here is a simple example:

.. literalinclude:: ../../../../rllib/examples/documentation/custom_gym_env.py
   :language: python

..   start-after: __sphinx_doc_model_construct_1_begin__
..   end-before: __sphinx_doc_model_construct_1_end__

However, you may also conveniently sub-class any of the other supported RLlib-specific
environment types. The automated paths from those env types (or callables returning instances of those types) to
an RLlib :py:class:`~ray.rllib.env.base_env.BaseEnv` are as follows:

- User provides a custom :py:class:`~ray.rllib.env.multi_agent_env.MultiAgentEnv` (is-a `gym.Env <https://github.com/openai/gym>`_) -> :py:class:`~ray.rllib.env.vector_env.VectorEnv` -> :py:class:`~ray.rllib.env.base_env.BaseEnv`
- User uses a policy client (via an external simulator) -> :py:class:`~ray.rllib.env.external_env.ExternalEnv` | :py:class:`~ray.rllib.env.external_multi_agent_env.ExternalMultiAgentEnv` -> :py:class:`~ray.rllib.env.base_env.BaseEnv`
- User provides a custom :py:class:`~ray.rllib.env.vector_env.VectorEnv` -> :py:class:`~ray.rllib.env.base_env.BaseEnv`
- User provides a custom :py:class:`~ray.rllib.env.base_env.BaseEnv` -> no conversion needed
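
As a rough illustration of the dict-based interface that a multi-agent environment exposes, the sketch below defines a standalone two-agent environment whose ``reset()`` and ``step()`` exchange dicts keyed by agent ID. ``ToyMultiAgentEnv`` and its agent IDs are hypothetical and do not subclass any RLlib type; a real environment would derive from :py:class:`~ray.rllib.env.multi_agent_env.MultiAgentEnv`. The ``"__all__"`` key follows the convention of flagging the end of the episode for all agents at once.

```python
class ToyMultiAgentEnv:
    """Hypothetical two-agent environment; standalone, not an RLlib class."""

    def __init__(self, max_steps=2):
        self.agent_ids = ["agent_a", "agent_b"]
        self.max_steps = max_steps
        self._t = 0

    def reset(self):
        self._t = 0
        # One initial observation per agent.
        return {aid: 0 for aid in self.agent_ids}

    def step(self, action_dict):
        self._t += 1
        finished = self._t >= self.max_steps
        # Every return value is a dict keyed by agent ID.
        obs = {aid: self._t for aid in action_dict}
        rewards = {aid: float(action) for aid, action in action_dict.items()}
        dones = {aid: finished for aid in action_dict}
        dones["__all__"] = finished  # episode-level termination flag
        infos = {aid: {} for aid in action_dict}
        return obs, rewards, dones, infos


ma_env = ToyMultiAgentEnv(max_steps=2)
first_obs = ma_env.reset()
ma_obs, ma_rewards, ma_dones, ma_infos = ma_env.step({"agent_a": 1, "agent_b": 0})
```

Only the agents that appear in the action dict receive entries in the returned dicts, which is what lets agents act at different times within the same episode.
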

Environment API Reference
-------------------------

.. toctree::
   :maxdepth: 1

   env/base_env.rst
   env/multi_agent_env.rst
   env/vector_env.rst
   env/external_env.rst