ray/doc/source/using-ray-on-a-cluster.md

# Using Ray on a cluster

Deploying Ray on a cluster currently requires a bit of manual work.

## Deploying Ray on a cluster.

This section assumes that you have a cluster running and that the node in the
cluster can communicate with each other. It also assumes that Ray is installed
on each machine. To install Ray, follow the instructions for [installation on
Ubuntu](install-on-ubuntu.md).

### Starting Ray on each machine.

On the head node (just choose some node to be the head node), run the following,
replacing `<redis-port>` with a port of your choice, e.g., `6379`.

```
./ray/scripts/start_ray.sh --head --redis-port <redis-port>
```

The `--redis-port` arugment is optional, and if not provided Ray starts Redis
on a port selected at random.
In either case, the command will print out the address of the Redis server
that was started (and some other address information).

Then on all of the other nodes, run the following. Make sure to replace
`<redis-address>` with the value printed by the command on the head node (it
should look something like `123.45.67.89:6379`).

```
./ray/scripts/start_ray.sh --redis-address <redis-address>
```

To specify the number of processes to start, use the flag `--num-workers`, as
follows:

```
./ray/scripts/start_ray.sh --num-workers <int>
```

Now we've started all of the Ray processes on each node Ray. This includes

- Some worker processes on each machine.
- An object store on each machine.
- A local scheduler on each machine.
- One Redis server (on the head node).
- One global scheduler (on the head node).
- Optionally, this may start up some processes for visualizing the system state
  through a web UI.

To run some commands, start up Python on one of the nodes in the cluster, and do
the following.

```python
import ray
ray.init(redis_address="<redis-address>")
```

Now you can define remote functions and execute tasks. For example:

```python
@ray.remote
def f(x):
  return x

ray.get([f.remote(f.remote(f.remote(0))) for _ in range(1000)])
```

### Stopping Ray

When you want to stop the Ray processes, run `./ray/scripts/stop_ray.sh`
on each node.

### Using the Web UI on a Cluster

If you followed the instructions for setting up the web UI, then
`./ray/scripts/start_ray.sh --head` will attempt to start a Python 3 webserver
on the head node. In order to view the web UI from a machine that is not part of
the cluster (like your laptop), you can use SSH port forwarding. The web UI
requires ports 8080 and 8888, which you can forward using a command like the
following.

```
ssh -L 8080:localhost:8080 -L 8888:localhost:8888 ubuntu@<head-node-public-ip>
```

Then you can view the web UI on your laptop by navigating to
`http://localhost:8080` in a browser.

#### Troubleshooting the Web UI

Note that to use the web UI, additional setup is required. In particular, you
must be using Python 3 or you must at least have `python3` on your path.

If the web UI doesn't work, it's possible that the web UI processes were never
started (check `ps aux | grep polymer` and `ps aux | grep ray_ui.py`). If they
were started, see if you can fetch the web UI from within the head node (try
`wget http://localhost:8080` on the head node).
update tutorial (#318) 2016-07-28 20:47:37 -07:00			`# Using Ray on a cluster`
basic tutorials (#204) 2016-07-06 13:51:32 -07:00
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00			`Deploying Ray on a cluster currently requires a bit of manual work.`

			`## Deploying Ray on a cluster.`

			`This section assumes that you have a cluster running and that the node in the`
			`cluster can communicate with each other. It also assumes that Ray is installed`
			`on each machine. To install Ray, follow the instructions for [installation on`
			`Ubuntu](install-on-ubuntu.md).`

			`### Starting Ray on each machine.`

Add Redis port option to startup script (#232) * specify redis address when starting head * cleanup * update starting cluster documentation * Whitespace. * Address Philipp's comments. * Change redis_host -> redis_ip_address. 2017-01-31 00:28:00 -08:00			`On the head node (just choose some node to be the head node), run the following,`
			replacing `<redis-port>` with a port of your choice, e.g., `6379`.
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00
			```
Add Redis port option to startup script (#232) * specify redis address when starting head * cleanup * update starting cluster documentation * Whitespace. * Address Philipp's comments. * Change redis_host -> redis_ip_address. 2017-01-31 00:28:00 -08:00			`./ray/scripts/start_ray.sh --head --redis-port <redis-port>`
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00			```

Add Redis port option to startup script (#232) * specify redis address when starting head * cleanup * update starting cluster documentation * Whitespace. * Address Philipp's comments. * Change redis_host -> redis_ip_address. 2017-01-31 00:28:00 -08:00			The `--redis-port` arugment is optional, and if not provided Ray starts Redis
			`on a port selected at random.`
			`In either case, the command will print out the address of the Redis server`
			`that was started (and some other address information).`
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00
			`Then on all of the other nodes, run the following. Make sure to replace`
			`<redis-address>` with the value printed by the command on the head node (it
Add Redis port option to startup script (#232) * specify redis address when starting head * cleanup * update starting cluster documentation * Whitespace. * Address Philipp's comments. * Change redis_host -> redis_ip_address. 2017-01-31 00:28:00 -08:00			should look something like `123.45.67.89:6379`).
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00
			```
			`./ray/scripts/start_ray.sh --redis-address <redis-address>`
			```
Improvements to documentation and error messages. (#221) 2017-01-19 20:27:46 -08:00
			To specify the number of processes to start, use the flag `--num-workers`, as
			`follows:`

updated cluster documentation (#216) 2017-01-19 13:59:54 -08:00			```
			`./ray/scripts/start_ray.sh --num-workers <int>`
			```
Improvements to documentation and error messages. (#221) 2017-01-19 20:27:46 -08:00
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00			`Now we've started all of the Ray processes on each node Ray. This includes`

			`- Some worker processes on each machine.`
			`- An object store on each machine.`
			`- A local scheduler on each machine.`
			`- One Redis server (on the head node).`
			`- One global scheduler (on the head node).`
Attempt to start web UI when starting Ray. (#269) * Attempt to start web UI when starting Ray. * Add instructions for using web UI to cluster documentation. * Don't check if port 8080 is open. * Remove print statement. 2017-02-12 15:17:58 -08:00			`- Optionally, this may start up some processes for visualizing the system state`
			`through a web UI.`
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00
Improvements to documentation and error messages. (#221) 2017-01-19 20:27:46 -08:00			`To run some commands, start up Python on one of the nodes in the cluster, and do`
			`the following.`
Documentation for using Ray on a cluster. (#165) 2016-12-30 00:29:03 -08:00
			```python
			`import ray`
			`ray.init(redis_address="<redis-address>")`
			```

			`Now you can define remote functions and execute tasks. For example:`

			```python
			`@ray.remote`
			`def f(x):`
			`return x`

			`ray.get([f.remote(f.remote(f.remote(0))) for _ in range(1000)])`
			```

Improvements to documentation and error messages. (#221) 2017-01-19 20:27:46 -08:00			`### Stopping Ray`
Attempt to start web UI when starting Ray. (#269) * Attempt to start web UI when starting Ray. * Add instructions for using web UI to cluster documentation. * Don't check if port 8080 is open. * Remove print statement. 2017-02-12 15:17:58 -08:00
updated cluster documentation (#216) 2017-01-19 13:59:54 -08:00			When you want to stop the Ray processes, run `./ray/scripts/stop_ray.sh`
			`on each node.`

Attempt to start web UI when starting Ray. (#269) * Attempt to start web UI when starting Ray. * Add instructions for using web UI to cluster documentation. * Don't check if port 8080 is open. * Remove print statement. 2017-02-12 15:17:58 -08:00			`### Using the Web UI on a Cluster`

			`If you followed the instructions for setting up the web UI, then`
			`./ray/scripts/start_ray.sh --head` will attempt to start a Python 3 webserver
			`on the head node. In order to view the web UI from a machine that is not part of`
			`the cluster (like your laptop), you can use SSH port forwarding. The web UI`
			`requires ports 8080 and 8888, which you can forward using a command like the`
			`following.`

			```
			`ssh -L 8080:localhost:8080 -L 8888:localhost:8888 ubuntu@<head-node-public-ip>`
			```

			`Then you can view the web UI on your laptop by navigating to`
			`http://localhost:8080` in a browser.

			`#### Troubleshooting the Web UI`

			`Note that to use the web UI, additional setup is required. In particular, you`
			must be using Python 3 or you must at least have `python3` on your path.

			`If the web UI doesn't work, it's possible that the web UI processes were never`
			started (check `ps aux \| grep polymer` and `ps aux \| grep ray_ui.py`). If they
			`were started, see if you can fetch the web UI from within the head node (try`
			`wget http://localhost:8080` on the head node).