gehring 7c3274e65b [tune] Make the logging of the function API consistent and predictable (#4011)
## What do these changes do?

This is a re-implementation of the `FunctionRunner` that enforces some synchronicity between the thread running the training function and the thread running the Trainable, which logs results. The main purpose is to make logging consistent across APIs in anticipation of a new, generator-based function API (using `yield` statements). Without these changes, the reporter-based API, which may soon be deprecated, cannot behave the same as the generator-based API.

This new implementation provides additional guarantees to prevent results from being dropped. This makes the logging behavior more intuitive and consistent with how results are handled in custom subclasses of Trainable.

New guarantees for the tune function API:

- Every reported result, i.e., every `reporter(**kwargs)` call, is forwarded to the appropriate loggers instead of being dropped when not enough time has elapsed since the last result.
- The wrapped function only runs if the `FunctionRunner` expects a result, i.e., when `FunctionRunner._train()` has been called. This removes the possibility that a result will be generated by the function but never logged.
- The wrapped function is not called until the first `_train()` call. Currently, the wrapped function is started during the setup phase, which could result in dropped results if the trial is cancelled between `_setup()` and the first `_train()` call.
- Exceptions raised by the wrapped function won't be propagated until all results are logged to prevent dropped results.
- The thread running the wrapped function is explicitly stopped when the `FunctionRunner` is stopped with `_stop()`.
- If the wrapped function terminates without reporting `done=True`, a duplicate of the last reported result with `{"done": True}` set is reported to explicitly terminate the trial. Components are notified with this duplicate, but it is not logged.
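
The guarantees above can be pictured with a minimal sketch, assuming a lock-step handoff between two threads. `LockstepRunner`, `train`, and `_report` are hypothetical names chosen for illustration; this is not the actual `FunctionRunner` code, and stopping the thread and logging are left out.

```python
import queue
import threading

_FINISHED = object()  # sentinel: the wrapped function returned or raised


class LockstepRunner:
    """Toy runner (hypothetical): the function only advances inside train()."""

    def __init__(self, func):
        self._func = func
        self._results = queue.Queue()
        self._resume = threading.Semaphore(0)
        self._thread = None   # started lazily on the first train() call
        self._error = None
        self._last = {}
        self._finished = False

    def _report(self, **kwargs):
        # Hand the result over, then block until the next train() call.
        self._results.put(dict(kwargs))
        self._resume.acquire()

    def _run(self):
        try:
            self._func(self._report)
        except Exception as exc:  # surfaced by train() after earlier results
            self._error = exc
        finally:
            self._results.put(_FINISHED)

    def train(self):
        if self._finished:
            return dict(self._last, done=True)
        if self._thread is None:
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()
        else:
            self._resume.release()  # let the function produce one more result
        item = self._results.get()
        if item is _FINISHED:
            self._finished = True
            if self._error is not None:
                raise self._error
            # No explicit done=True was reported: duplicate the last result.
            return dict(self._last, done=True)
        self._last = item
        return item


def training_function(report):
    for step in range(3):
        report(step=step)


runner = LockstepRunner(training_function)
for _ in range(4):
    print(runner.train())
```

Each `train()` call yields exactly one reported result, so nothing is dropped, and the fourth call returns a copy of the last result with `done` set to `True` because the function ended without reporting it.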

## Related issue number

Closes #3956.
#3949
#3834
2019-03-18 19:14:26 -07:00

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/ray_header_logo.png

.. image:: https://travis-ci.com/ray-project/ray.svg?branch=master
    :target: https://travis-ci.com/ray-project/ray

.. image:: https://readthedocs.org/projects/ray/badge/?version=latest
    :target: http://ray.readthedocs.io/en/latest/?badge=latest

.. image:: https://img.shields.io/badge/pypi-0.6.4-blue.svg
    :target: https://pypi.org/project/ray/

|

**Ray is a flexible, high-performance distributed execution framework.**


Ray is easy to install: ``pip install ray``

Example Use
-----------

+------------------------------------------------+----------------------------------------------------+
| **Basic Python**                               | **Distributed with Ray**                           |
+------------------------------------------------+----------------------------------------------------+
|.. code-block:: python                          |.. code-block:: python                              |
|                                                |                                                    |
|                                                |  import ray                                        |
|  import time                                   |  import time                                       |
|                                                |                                                    |
|  # Execute f serially.                         |  # Execute f in parallel.                          |
|                                                |                                                    |
|                                                |  @ray.remote                                       |
|  def f():                                      |  def f():                                          |
|      time.sleep(1)                             |      time.sleep(1)                                 |
|      return 1                                  |      return 1                                      |
|                                                |                                                    |
|                                                |                                                    |
|                                                |  ray.init()                                        |
|  results = [f() for i in range(4)]             |  results = ray.get([f.remote() for i in range(4)]) |
+------------------------------------------------+----------------------------------------------------+


Ray comes with libraries that accelerate deep learning and reinforcement learning development:

- `Tune`_: Hyperparameter Optimization Framework
- `RLlib`_: Scalable Reinforcement Learning
- `Distributed Training <http://ray.readthedocs.io/en/latest/distributed_sgd.html>`__

.. _`Tune`: http://ray.readthedocs.io/en/latest/tune.html
.. _`RLlib`: http://ray.readthedocs.io/en/latest/rllib.html

Installation
------------

Ray can be installed on Linux and Mac with ``pip install ray``.

To build Ray from source or to install the nightly versions, see the `installation documentation`_.

.. _`installation documentation`: http://ray.readthedocs.io/en/latest/installation.html

More Information
----------------

- `Documentation`_
- `Tutorial`_
- `Blog`_
- `Ray paper`_
- `Ray HotOS paper`_

.. _`Documentation`: http://ray.readthedocs.io/en/latest/index.html
.. _`Tutorial`: https://github.com/ray-project/tutorial
.. _`Blog`: https://ray-project.github.io/
.. _`Ray paper`: https://arxiv.org/abs/1712.05889
.. _`Ray HotOS paper`: https://arxiv.org/abs/1703.03924

Getting Involved
----------------

- `ray-dev@googlegroups.com`_: For discussions about development or any general
  questions.
- `StackOverflow`_: For questions about how to use Ray.
- `GitHub Issues`_: For reporting bugs and feature requests.
- `Pull Requests`_: For submitting code contributions.

.. _`ray-dev@googlegroups.com`: https://groups.google.com/forum/#!forum/ray-dev
.. _`GitHub Issues`: https://github.com/ray-project/ray/issues
.. _`StackOverflow`: https://stackoverflow.com/questions/tagged/ray
.. _`Pull Requests`: https://github.com/ray-project/ray/pulls