---
layout: post
title: "Ray: 0.3 Release"
excerpt: "This post announces the release of Ray 0.3."
date: 2017-11-30 14:00:00
---

We are pleased to announce the Ray 0.3 release. This release introduces
[**distributed actor handles**][1] and [**Ray.tune**][2], a new
hyperparameter search library. It also includes a number of bug fixes and
stability enhancements.

To upgrade to the latest version, run

```bash
pip install -U ray
```

## Hyperparameter Search Tool

This release adds [**Ray.tune**][2], a distributed hyperparameter evaluation
tool for long-running tasks such as reinforcement learning and deep learning
training. It currently includes the following features:

- Pluggable **early stopping algorithms**, including the [Median Stopping Rule][4]
  and [Hyperband][5].
- Integration with **visualization tools** such as [TensorBoard][6],
  [rllab's VisKit][7], and a [parallel coordinates visualization][8].
- Flexible **trial variant generation**, including grid search, random search,
  and conditional parameter distributions.
- **Resource-aware scheduling**, including support for concurrent runs of
  algorithms that require GPUs or are themselves parallel and distributed.

Ray.tune provides a Python API for use with deep learning and other
compute-intensive training tasks. Here is a toy example illustrating its usage.

```python
from ray.tune import register_trainable, grid_search, run_experiments


def my_func(config, reporter):
    import time
    i = 0
    while True:
        reporter(timesteps_total=i, mean_accuracy=i ** config['alpha'])
        i += config['beta']
        time.sleep(0.01)


register_trainable('my_func', my_func)

run_experiments({
    'my_experiment': {
        'run': 'my_func',
        'resources': {'cpu': 1, 'gpu': 0},
        'stop': {'mean_accuracy': 100},
        'config': {
            'alpha': grid_search([0.2, 0.4, 0.6]),
            'beta': grid_search([1, 2]),
        },
    }
})
```
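
The toy example above only exercises grid search. The feature list also mentions
random search and conditional parameter distributions; the sketch below shows one
way those could be expressed, following the sampling-function style described in
the early Ray.tune documentation. Treat the `lambda spec: ...` form and the
`spec.config` attribute access as assumptions to check against the docs [2] for
your version.

```python
import random

from ray.tune import run_experiments

# Continues the previous snippet: assumes 'my_func' has already been
# registered via register_trainable('my_func', my_func).
run_experiments({
    'my_random_experiment': {
        'run': 'my_func',
        'resources': {'cpu': 1, 'gpu': 0},
        'stop': {'mean_accuracy': 100},
        'config': {
            # Sample alpha at random instead of enumerating a grid.
            'alpha': lambda spec: random.uniform(0.1, 0.9),
            # Make beta depend on the sampled alpha (a conditional parameter).
            'beta': lambda spec: 1 + int(4 * spec.config.alpha),
        },
    }
})
```
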

In-progress results can be visualized live using tools such as TensorBoard and
rllab's VisKit (or you can read the JSON-format logs directly from the driver
node).
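
If you prefer the raw logs to a dashboard, here is a minimal sketch of reading
them on the driver node. It assumes results are written as JSON-lines files named
`result.json` under a local results directory (shown here as
`~/ray_results/my_experiment`); both the location and the file name may differ in
your setup, so check the Ray.tune documentation [2].

```python
import glob
import json
import os

# Hypothetical results location; point this at wherever Ray.tune writes
# its logs on your driver node.
results_dir = os.path.expanduser('~/ray_results/my_experiment')

# Each trial directory is assumed to contain a JSON-lines result file.
for path in glob.glob(os.path.join(results_dir, '*', 'result.json')):
    last = None
    with open(path) as f:
        for line in f:
            last = json.loads(line)
    if last is not None:
        print(path, last.get('timesteps_total'), last.get('mean_accuracy'))
```
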

View the [documentation][2] and the [code][9].

### Ray.tune integration with RLlib

You can try out RLlib with Ray.tune using the following example.

```bash
cd ray/python/ray/rllib
python train.py -f tuned_examples/cartpole-grid-search-example.yaml
```
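
For reference, such a YAML file roughly plays the same role as a
`run_experiments` call. Below is a hedged Python sketch of a comparable
grid-search experiment over CartPole; the RLlib-specific fields (the top-level
`env` key, the stopping metric, and the PPO parameter names) are written from
memory of that era's API, so treat them as illustrative rather than as the
contents of the example file, and check the RLlib documentation [3].

```python
from ray.tune import grid_search, run_experiments

run_experiments({
    'cartpole-grid-search': {
        'run': 'PPO',                          # RLlib algorithm to tune
        'env': 'CartPole-v0',                  # assumed top-level key for RLlib runs
        'resources': {'cpu': 4, 'gpu': 0},
        'stop': {'episode_reward_mean': 200},  # assumed RLlib-reported metric
        'config': {
            # Parameter names are illustrative; see the RLlib docs for the
            # exact PPO configuration keys in your version.
            'num_workers': 2,
            'num_sgd_iter': grid_search([1, 4]),
            'sgd_batchsize': grid_search([128, 256]),
        },
    }
})
```
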

The `tuned_examples` directory also contains pre-tuned hyperparameter
configurations for common benchmark tasks such as Pong and Humanoid. View the
[RLlib documentation][3].

### Initial Support for PyTorch in RLlib

A modern reinforcement learning library should work with multiple deep learning
frameworks. As a step toward this goal, 0.3 adds support for PyTorch models for
A3C in RLlib. You can try this out with the following A3C config.

```bash
cd ray/python/ray/rllib
./train.py --run=A3C \
    --env=PongDeterministic-v4 \
    --config='{"use_pytorch": true, "num_workers": 8, "use_lstm": false, "model": {"grayscale": true, "zero_mean": false, "dim": 80, "channel_major": true}}'
```

## Distributed Actor Handles

Ray 0.3 adds support for [distributed actor handles][1], that is, the ability to
have multiple callers invoke methods on the same actor. The actor's creator may
pass the actor handle as an argument to other tasks or to other actor methods.
Here's an example in which the driver creates an actor to log messages and
passes an actor handle to other tasks:

```python
import ray

ray.init()

@ray.remote
class Logger(object):
    def __init__(self):
        self.logs = []

    def log(self, log_msg):
        self.logs.append(log_msg)

    def read_logs(self):
        return self.logs

@ray.remote
def task(logger, task_index):
    # Do some work.
    logger.log.remote('Task {} is done'.format(task_index))

# Create an actor and get a handle to it.
logger = Logger.remote()
# Pass the actor handle to some tasks.
tasks = [task.remote(logger, i) for i in range(10)]
# Wait for the tasks to finish.
ray.get(tasks)
# Get the logs from the tasks.
print(ray.get(logger.read_logs.remote()))
```

This feature is still considered experimental, but we've already found
distributed actor handles useful for implementing [**parameter server**][10] and
[**streaming MapReduce**][11] applications.
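
As a flavor of the parameter server use case, here is a minimal sketch (not the
code behind the linked example) in which several tasks share one
parameter-server actor through its handle:

```python
import numpy as np
import ray

ray.init()

@ray.remote
class ParameterServer(object):
    def __init__(self, dim):
        self.params = np.zeros(dim)

    def get_params(self):
        return self.params

    def update(self, delta):
        self.params += delta

@ray.remote
def worker(ps, num_steps):
    for _ in range(num_steps):
        params = ray.get(ps.get_params.remote())
        # A placeholder "update" computed from the current parameters.
        ps.update.remote(0.01 * (1.0 - params))

# Create the actor and pass its handle to several concurrent tasks.
ps = ParameterServer.remote(10)
ray.get([worker.remote(ps, 20) for _ in range(4)])
print(ray.get(ps.get_params.remote()))
```

Because every task holds a handle to the same actor, all updates are applied to
a single copy of the parameters, which is exactly what distributed actor handles
enable.
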

[1]: http://docs.ray.io/en/master/actors.html#passing-around-actor-handles-experimental
[2]: http://docs.ray.io/en/master/tune.html
[3]: http://docs.ray.io/en/master/rllib.html
[4]: https://research.google.com/pubs/pub46180.html
[5]: https://arxiv.org/abs/1603.06560
[6]: https://www.tensorflow.org/get_started/summaries_and_tensorboard
[7]: https://media.readthedocs.org/pdf/rllab/latest/rllab.pdf
[8]: https://en.wikipedia.org/wiki/Parallel_coordinates
[9]: https://github.com/ray-project/ray/tree/master/python/ray/tune
[10]: http://docs.ray.io/en/master/example-parameter-server.html
[11]: http://docs.ray.io/en/master/example-streaming.html