..important:: For the multi-node setting, you must first run ``ray start`` on the command line to start the Ray cluster services on the machine before ``ray.init`` in Python to connect to the cluster services. On a single machine, you can run ``ray.init()`` without ``ray start``, which will both start the Ray cluster services and connect to them.
(There is not currently a stable way to change the root temporary directory when calling ``ray.init()``, but if you need to, you can provide the ``_temp_dir`` argument to ``ray.init()``.)
Ray requires bi-directional communication among its nodes in a cluster. Each of node is supposed to open specific ports to receive incoming network requests.
All Nodes
~~~~~~~~~
-``--node-manager-port``: Raylet port for node manager. Default: Random value.
-``--object-manager-port``: Raylet port for object manager. Default: Random value.
Port numbers are how Ray disambiguates input and output to and from multiple workers on a single node. Each worker will take input and give output on a single port number. Thus, for example, by default, there is a maximum of 1,000 workers on each node, irrespective of number of CPUs.
In general, it is recommended to give Ray a wide range of possible worker ports, in case any of those ports happen to be in use by some other program on your machine. However, when debugging it is useful to explicitly specify a short list of worker ports such as ``--worker-port-list=10000,10001,10002,10003,10004`` (note that this will limit the number of workers, just like specifying a narrow range).
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":4165,"referenced_errors":[{"description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":397,"grpc_status":14}]}"
(Also, you will not be able to access the dashboard.)
If you see that error, check whether the ``--dashboard-port`` is accessible
with ``nc`` or ``nmap`` (or your browser).
..code-block:: bash
$ nmap -sV --reason -p 8265 $HEAD_ADDRESS
Nmap scan report for compute04.berkeley.edu (123.456.78.910)
Host is up, received reset ttl 60 (0.00065s latency).
rDNS record for 123.456.78.910: compute04.berkeley.edu
PORT STATE SERVICE REASON VERSION
8265/tcp open http syn-ack ttl 60 aiohttp 3.7.2 (Python 3.8)
Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Note that the dashboard runs as a separate subprocess which can crash invisibly
in the background, so even if you checked port 8265 earlier, the port might be
closed *now* (for the prosaic reason that there is no longer a service running
on it). This also means that if that port is unreachable, if you ``ray stop``
and ``ray start``, it may become reachable again due to the dashboard
restarting.
If you don't want the dashboard, set ``--include-dashboard=false``.
..important:: For the multi-node setting, you must first run ``ray start`` on the command line to start the Ray cluster services on the machine before ``Ray.init()`` in Java to connect to the cluster services. On a single machine, you can run ``Ray.init()`` without ``ray start``, which will both start the Ray cluster services and connect to them.
.._code_search_path:
Code Search Path
~~~~~~~~~~~~~~~~
If you want to run a Java application in cluster mode, you must first run ``ray start`` to start the Ray cluster. In addition to any ``ray start`` parameters mentioned above, you must add ``--code-search-path`` to tell Ray where to load jars when starting Java workers. Your jar files must be distributed to all nodes of the Ray cluster before running your code, and this parameter must be set on both the head node and non-head nodes.
..code-block:: bash
$ ray start ... --code-search-path=/path/to/jars
The ``/path/to/jars`` here points to a directory which contains jars. All jars in the directory will be loaded by workers. You can also provide multiple directories for this parameter.
..code-block:: bash
$ ray start ... --code-search-path=/path/to/jars1:/path/to/jars2:/path/to/pys1:/path/to/pys2
Code search path is also used for loading Python code if it's specified. This is required for :ref:`cross_language`. If code search path is specified, you can only run Python remote functions which can be found in the code search path.
You don't need to configure code search path if you run a Java application in single machine mode.
..note:: Currently we don't provide a way to configure Ray when running a Java application in single machine mode. If you need to configure Ray, run ``ray start`` to start the Ray cluster first.
Driver Options
~~~~~~~~~~~~~~
There is a limited set of options for Java drivers. They are not for configuring the Ray cluster, but only for configuring the driver.
Ray uses `Typesafe Config <https://lightbend.github.io/config/>`__ to read options. There are several ways to set options:
- System properties. You can configure system properties either by adding options in the format of ``-Dkey=value`` in the driver command line, or by invoking ``System.setProperty("key", "value");`` before ``Ray.init()``.
- A `HOCON format <https://github.com/lightbend/config/blob/master/HOCON.md>`__ configuration file. By default, Ray will try to read the file named ``ray.conf`` in the root of the classpath. You can customize the location of the file by setting system property ``ray.config-file`` to the path of the file.
..note:: Options configured by system properties have higher priority than options configured in the configuration file.
The list of available driver options:
-``ray.address``
- The cluster address if the driver connects to an existing Ray cluster. If it is empty, a new Ray cluster will be created.
- Type: ``String``
- Default: empty string.
-``ray.local-mode``
- If it's set to ``true``, the driver will run in :ref:`local_mode`.