Eric Liang 7dee2c6735
[rllib] Envs for vectorized execution, async execution, and policy serving (#2170)
## What do these changes do?

**Vectorized envs**: Users can either implement `VectorEnv`, or alternatively set `num_envs=N` to auto-vectorize gym envs (this vectorizes just the action computation part).

```
# CartPole-v0 on single core with 64x64 MLP:

# vector_width=1:
Actions per second 2720.1284458322966

# vector_width=8:
Actions per second 13773.035334888269

# vector_width=64:
Actions per second 37903.20472563333
```
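
To illustrate why batching the action computation helps, here is a minimal, self-contained sketch of auto-vectorization. It is not RLlib's implementation: `ToyEnv`, `ToyVectorEnv`, `batched_policy`, and the method names `vector_reset` / `vector_step` are hypothetical names chosen for this example, and the "policy" is a trivial threshold rule rather than a neural network.

```python
import random
from typing import Callable, List, Sequence, Tuple


class ToyEnv:
    """Hypothetical stand-in for a gym-style env (not a real gym.Env)."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self._t = 0

    def reset(self) -> float:
        self._t = 0
        return self._rng.random()

    def step(self, action: int) -> Tuple[float, float, bool]:
        self._t += 1
        obs = self._rng.random()
        reward = 1.0 if action == 1 else 0.0
        done = self._t >= 5  # fixed-length episodes, for simplicity
        return obs, reward, done


class ToyVectorEnv:
    """Steps N independent envs in lockstep from one batch of actions."""

    def __init__(self, make_env: Callable[[int], ToyEnv], num_envs: int) -> None:
        self.envs = [make_env(i) for i in range(num_envs)]

    def vector_reset(self) -> List[float]:
        return [env.reset() for env in self.envs]

    def vector_step(
        self, actions: Sequence[int]
    ) -> Tuple[List[float], List[float], List[bool]]:
        obs_batch, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done = env.step(action)
            if done:
                obs = env.reset()  # restart finished envs so the batch stays full
            obs_batch.append(obs)
            rewards.append(reward)
            dones.append(done)
        return obs_batch, rewards, dones


def batched_policy(obs_batch: Sequence[float]) -> List[int]:
    """One call computes actions for every env, amortizing per-call overhead."""
    return [1 if obs > 0.5 else 0 for obs in obs_batch]


def rollout(num_envs: int, num_steps: int) -> Tuple[int, float]:
    """Returns (number of policy calls, total reward collected)."""
    venv = ToyVectorEnv(ToyEnv, num_envs)
    obs_batch = venv.vector_reset()
    policy_calls, total_reward = 0, 0.0
    for _ in range(num_steps):
        actions = batched_policy(obs_batch)  # a single call per step for all envs
        policy_calls += 1
        obs_batch, rewards, _ = venv.vector_step(actions)
        total_reward += sum(rewards)
    return policy_calls, total_reward
```

With `rollout(8, 10)` the policy is invoked 10 times while 80 env steps are taken. With a real model, each of those calls would be one batched forward pass instead of eight separate ones.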

**Async envs**: The more general form of `VectorEnv` is `AsyncVectorEnv`, which allows agents to execute out of lockstep. We use this as an adapter to support `ServingEnv`. Since we can convert any other form of env to `AsyncVectorEnv`, `utils.sampler` has been rewritten to run against this interface.

**Policy serving**: This provides an env which is not stepped. Rather, the env executes in its own thread, querying the policy for actions via `self.get_action(obs)`, and reporting results via `self.log_returns(rewards)`. We also support logging of off-policy actions via `self.log_action(obs, action)`. This is a more convenient API for some use cases, and also provides parallelizable support for policy serving (for example, if you start an HTTP server in the env) and ingestion of offline logs (if the env reads from serving logs).
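
The inversion of control described above can be sketched with two queues and a thread. This is only an illustration under stated assumptions, not the `ServingEnv` implementation: `ToyServingEnv`, `serve`, and the queue attributes are hypothetical, and only the method names `get_action` and `log_returns` are taken from the description.

```python
import queue
import threading
from typing import Callable, List


class ToyServingEnv(threading.Thread):
    """Hypothetical env that drives itself from its own thread."""

    def __init__(self, num_steps: int) -> None:
        super().__init__(daemon=True)
        self.num_steps = num_steps
        self.obs_queue: queue.Queue = queue.Queue()     # env -> policy
        self.action_queue: queue.Queue = queue.Queue()  # policy -> env
        self.returns: List[float] = []

    def get_action(self, obs: int) -> int:
        """Called from the env thread: blocks until the policy answers."""
        self.obs_queue.put(obs)
        return self.action_queue.get()

    def log_returns(self, reward: float) -> None:
        """Called from the env thread to report a reward."""
        self.returns.append(reward)

    def run(self) -> None:
        # The env decides when to ask for actions; nobody calls step() on it.
        obs = 0
        for _ in range(self.num_steps):
            action = self.get_action(obs)
            self.log_returns(1.0 if action == obs % 2 else 0.0)
            obs += 1
        self.obs_queue.put(None)  # sentinel: no more requests


def serve(env: ToyServingEnv, policy: Callable[[int], int]) -> int:
    """Answers the env's action requests until it signals completion."""
    env.start()
    served = 0
    while True:
        obs = env.obs_queue.get()
        if obs is None:
            break
        env.action_queue.put(policy(obs))
        served += 1
    env.join()
    return served
```

In this sketch the body of `run()` could just as well wait on incoming HTTP requests or read lines from a log file. The policy side only sees a stream of observations to answer.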

Any of these types of envs can be passed to RLlib agents. RLlib handles conversions internally in `CommonPolicyEvaluator`, for example:

```
gym.Env => rllib.VectorEnv => rllib.AsyncVectorEnv
rllib.ServingEnv => rllib.AsyncVectorEnv
```
2018-06-18 11:55:32 -07:00

Ray
===

.. image:: https://travis-ci.com/ray-project/ray.svg?branch=master
    :target: https://travis-ci.com/ray-project/ray

.. image:: https://readthedocs.org/projects/ray/badge/?version=latest
    :target: http://ray.readthedocs.io/en/latest/?badge=latest

|

Ray is a flexible, high-performance distributed execution framework.


Ray is easy to install: ``pip install ray``

Example Use
-----------

+------------------------------------------------+----------------------------------------------------+
| **Basic Python**                               | **Distributed with Ray**                           |
+------------------------------------------------+----------------------------------------------------+
|.. code-block:: python                          |.. code-block:: python                              |
|                                                |                                                    |
|  import time                                   |  import time                                       |
|                                                |  import ray                                        |
|                                                |                                                    |
|  # Execute f serially.                         |  # Execute f in parallel.                          |
|                                                |                                                    |
|                                                |  @ray.remote                                       |
|  def f():                                      |  def f():                                          |
|      time.sleep(1)                             |      time.sleep(1)                                 |
|      return 1                                  |      return 1                                      |
|                                                |                                                    |
|                                                |                                                    |
|                                                |  ray.init()                                        |
|  results = [f() for i in range(4)]             |  results = ray.get([f.remote() for i in range(4)]) |
+------------------------------------------------+----------------------------------------------------+


Ray comes with libraries that accelerate deep learning and reinforcement learning development:

- `Ray Tune`_: Hyperparameter Optimization Framework
- `Ray RLlib`_: Scalable Reinforcement Learning

.. _`Ray Tune`: http://ray.readthedocs.io/en/latest/tune.html
.. _`Ray RLlib`: http://ray.readthedocs.io/en/latest/rllib.html

Installation
------------

Ray can be installed on Linux and Mac with ``pip install ray``.

To build Ray from source or to install the nightly versions, see the `installation documentation`_.

.. _`installation documentation`: http://ray.readthedocs.io/en/latest/installation.html

More Information
----------------

- `Documentation`_
- `Tutorial`_
- `Blog`_
- `Ray paper`_
- `Ray HotOS paper`_

.. _`Documentation`: http://ray.readthedocs.io/en/latest/index.html
.. _`Tutorial`: https://github.com/ray-project/tutorial
.. _`Blog`: https://ray-project.github.io/
.. _`Ray paper`: https://arxiv.org/abs/1712.05889
.. _`Ray HotOS paper`: https://arxiv.org/abs/1703.03924

Getting Involved
----------------

- Ask questions on our mailing list `ray-dev@googlegroups.com`_.
- Please report bugs by submitting a `GitHub issue`_.
- Submit contributions using `pull requests`_.

.. _`ray-dev@googlegroups.com`: https://groups.google.com/forum/#!forum/ray-dev
.. _`GitHub issue`: https://github.com/ray-project/ray/issues
.. _`pull requests`: https://github.com/ray-project/ray/pulls