---
layout: post
title: "Ray: 0.3 Release"
excerpt: "This post announces the release of Ray 0.3."
date: 2017-11-30 14:00:00
---

We are pleased to announce the Ray 0.3 release. This release introduces
[**distributed actor handles**][1] and [**Ray.tune**][2] --- a new
hyperparameter search library. It also includes a number of bug fixes and
stability enhancements.

To upgrade to the latest version, run

```bash
pip install -U ray
```

## Hyperparameter Search Tool

This release adds [**Ray.tune**][2], a distributed hyperparameter evaluation
tool for long-running tasks such as reinforcement learning and deep learning
training. It currently includes the following features:

- Pluggable **early stopping algorithms** including the [Median Stopping
  Rule][4] and [Hyperband][5].
- Integration with **visualization tools** such as [TensorBoard][6],
  [rllab's VisKit][7], and a [parallel coordinates visualization][8].
- Flexible **trial variant generation**, including grid search, random search,
  and conditional parameter distributions (see the sketch below).
- **Resource-aware scheduling**, including support for concurrent runs of
  algorithms that require GPUs or are themselves parallel and distributed.

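As a sketch of random search and conditional distributions: trial variants can
be declared directly in the `config` dictionary. The lambda-based sampling
syntax below follows the Ray.tune documentation; the parameter names here are
hypothetical:

```python
import numpy as np
from ray.tune import grid_search

# Hypothetical search space mixing the three variant types:
config = {
    # Random search: 'alpha' is sampled uniformly for each trial.
    'alpha': lambda spec: np.random.uniform(0.1, 1.0),
    # Conditional distribution: 'beta' depends on the sampled 'alpha'.
    'beta': lambda spec: spec.config.alpha * np.random.normal(),
    # Grid search: one trial is generated per listed value.
    'num_layers': grid_search([1, 2, 3]),
}
```
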
Ray.tune provides a Python API for use with deep learning and other
compute-intensive training tasks. Here is a toy example illustrating usage.

```python
from ray.tune import register_trainable, grid_search, run_experiments

def my_func(config, reporter):
    import time
    i = 0
    while True:
        # Report intermediate results so that Ray.tune can log progress
        # and apply the stopping condition.
        reporter(timesteps_total=i, mean_accuracy=i ** config['alpha'])
        i += config['beta']
        time.sleep(0.01)

register_trainable('my_func', my_func)

run_experiments({
    'my_experiment': {
        'run': 'my_func',
        'resources': {'cpu': 1, 'gpu': 0},
        'stop': {'mean_accuracy': 100},
        'config': {
            'alpha': grid_search([0.2, 0.4, 0.6]),
            'beta': grid_search([1, 2]),
        },
    }
})
```

In-progress results can be visualized live using tools such as TensorBoard and
rllab's VisKit, or you can read the JSON-format logs directly from the driver
node.

View the [documentation][2] and the [code][9].

### Ray.tune integration with RLlib

You can try out RLlib with Ray.tune using the following example.

```bash
cd ray/python/ray/rllib
python train.py -f tuned_examples/cartpole-grid-search-example.yaml
```

The `tuned_examples` directory also contains pre-tuned hyperparameter
configurations for common benchmark tasks such as Pong and Humanoid. View the
[RLlib documentation][3].

### Initial Support for PyTorch in RLlib

A modern reinforcement learning library should work with multiple deep learning
frameworks. As a step toward this goal, 0.3 adds support for PyTorch models for
A3C in RLlib. You can try this out with the following A3C config.

```bash
cd ray/python/ray/rllib
./train.py --run=A3C \
    --env=PongDeterministic-v4 \
    --config='{"use_pytorch": true, "num_workers": 8, "use_lstm": false, "model": {"grayscale": true, "zero_mean": false, "dim": 80, "channel_major": true}}'
```

## Distributed Actor Handles

Ray 0.3 adds support for [distributed actor handles][1], that is, the ability
to have multiple callers invoke methods on the same actor. The actor’s creator
may pass the actor handle as an argument to other tasks or to other actor
methods. Here’s an example in which the driver creates an actor to log messages
and passes an actor handle to other tasks:

```python
import ray

ray.init()

@ray.remote
class Logger(object):
    def __init__(self):
        self.logs = []

    def log(self, log_msg):
        self.logs.append(log_msg)

    def read_logs(self):
        return self.logs

@ray.remote
def task(logger, task_index):
    # Do some work.
    logger.log.remote('Task {} is done'.format(task_index))

# Create an actor and get a handle to it.
logger = Logger.remote()
# Pass the actor handle to some tasks.
tasks = [task.remote(logger, i) for i in range(10)]
# Wait for the tasks to finish.
ray.get(tasks)
# Get the logs from the tasks.
print(ray.get(logger.read_logs.remote()))
```

This feature is still considered experimental, but we've already found
distributed actor handles useful for implementing [**parameter server**][10]
and [**streaming MapReduce**][11] applications.

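To illustrate the parameter server pattern, here is a minimal sketch (not the
code from the linked example; the `ParameterServer` class and the placeholder
gradients are ours). Multiple workers share one actor handle and push updates
to the same actor:

```python
import ray
import numpy as np

ray.init()

@ray.remote
class ParameterServer(object):
    def __init__(self, dim):
        self.params = np.zeros(dim)

    def apply_gradient(self, grad):
        self.params += grad

    def get_params(self):
        return self.params

@ray.remote
def worker(ps, num_steps):
    # Each worker receives the same actor handle, pulls the current
    # parameters, and pushes (placeholder) gradients back to the server.
    for _ in range(num_steps):
        params = ray.get(ps.get_params.remote())
        ps.apply_gradient.remote(np.ones_like(params))

ps = ParameterServer.remote(10)
ray.get([worker.remote(ps, 5) for _ in range(4)])
print(ray.get(ps.get_params.remote()))
```
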
[1]: http://ray.readthedocs.io/en/latest/actors.html#passing-around-actor-handles-experimental
[2]: http://ray.readthedocs.io/en/latest/tune.html
[3]: http://ray.readthedocs.io/en/latest/rllib.html
[4]: https://research.google.com/pubs/pub46180.html
[5]: https://arxiv.org/abs/1603.06560
[6]: https://www.tensorflow.org/get_started/summaries_and_tensorboard
[7]: https://media.readthedocs.org/pdf/rllab/latest/rllab.pdf
[8]: https://en.wikipedia.org/wiki/Parallel_coordinates
[9]: https://github.com/ray-project/ray/tree/master/python/ray/tune
[10]: http://ray.readthedocs.io/en/latest/example-parameter-server.html
[11]: http://ray.readthedocs.io/en/latest/example-streaming.html