Development Tips
================

Compilation
-----------

To speed up compilation, be sure to install Ray with:

.. code-block:: shell

 cd ray/python
 pip install -e . --verbose

The ``-e`` means "editable", so changes you make to files in the Ray
directory will take effect without reinstalling the package. In contrast, if
you do ``python setup.py install``, files will be copied from the Ray
directory to a directory of Python packages (often something like
``/home/ubuntu/anaconda3/lib/python3.6/site-packages/ray``). This means that
changes you make to files in the Ray directory will not have any effect.

If you run into **Permission Denied** errors when running ``pip install``,
you can try adding ``--user``. You may also need to run something like ``sudo
chown -R $USER /home/ubuntu/anaconda3`` (substituting in the appropriate
path).

If you make changes to the C++ or Python files, you will need to run the build
so that the C++ code is recompiled and/or the Python files are redeployed in
``ray/python``. However, you do not need to rerun ``pip install -e .``. Instead,
you can recompile much more quickly by running:

.. code-block:: shell

 cd ray
 bash build.sh

This command is not enough to recompile all C++ unit tests. To do so, see
`Testing locally`_.

Debugging
---------

Starting processes in a debugger
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When processes are crashing, it is often useful to start them in a debugger.
Ray currently allows processes to be started in the following:

- valgrind
- the valgrind profiler
- the perftools profiler
- gdb
- tmux

To use any of these tools, please make sure that you have them installed on
your machine first (``gdb`` and ``valgrind`` on macOS are known to have issues).
Then, you can launch a subset of Ray processes by adding the environment
variable ``RAY_{PROCESS_NAME}_{DEBUGGER}=1``. For instance, if you wanted to
start the raylet in ``valgrind``, then you simply need to set the environment
variable ``RAY_RAYLET_VALGRIND=1``.
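
The naming pattern can also be applied from Python when launching a driver
script. The following is a minimal sketch, not part of Ray: the helper name
``debugger_env`` and the script name ``my_script.py`` are hypothetical, and it
assumes that the process and debugger names are simply upper-cased into the
variable name, as in ``RAY_RAYLET_VALGRIND``.

.. code-block:: python

 import os


 def debugger_env(process_name, debugger):
     """Return a copy of the environment with RAY_{PROCESS_NAME}_{DEBUGGER}=1."""
     env = dict(os.environ)
     # Assumption: names are upper-cased, e.g. ("raylet", "valgrind")
     # becomes RAY_RAYLET_VALGRIND.
     env["RAY_{}_{}".format(process_name.upper(), debugger.upper())] = "1"
     return env


 # Example usage (my_script.py is a placeholder for your own driver script):
 # import subprocess, sys
 # subprocess.run([sys.executable, "my_script.py"],
 #                env=debugger_env("raylet", "valgrind"))

Copying the environment rather than modifying ``os.environ`` in place keeps the
setting scoped to the launched script.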

To start a process inside of ``gdb``, the process must also be started inside of
``tmux``. So if you want to start the raylet in ``gdb``, you would start your
Python script with the following:

.. code-block:: bash

 RAY_RAYLET_GDB=1 RAY_RAYLET_TMUX=1 python

You can then list the ``tmux`` sessions with ``tmux ls`` and attach to the
appropriate one.

You can also get a core dump of the ``raylet`` process, which is especially
useful when filing `issues`_. The process to obtain a core dump is OS-specific,
but usually involves running ``ulimit -c unlimited`` before starting Ray to
allow core dump files to be written.
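
If Ray is started from a Python driver rather than from a shell, a similar
effect can be sketched with the standard ``resource`` module (Unix only). This
is an illustration under assumptions, not an official Ray mechanism: the helper
name ``enable_core_dumps`` is hypothetical, it only raises the soft limit as far
as the hard limit allows (which matches ``ulimit -c unlimited`` only when the
hard limit is itself unlimited), and it relies on child processes inheriting
the limits of the process that starts them.

.. code-block:: python

 import resource


 def enable_core_dumps():
     """Raise the soft core-file size limit to the hard limit for this process."""
     _, hard = resource.getrlimit(resource.RLIMIT_CORE)
     # Raising the soft limit up to the hard limit needs no extra privileges.
     resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
     return resource.getrlimit(resource.RLIMIT_CORE)


 # Call this before ray.init() so that processes started afterwards
 # inherit the raised limit.
 soft, hard = enable_core_dumps()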

Inspecting Redis shards
~~~~~~~~~~~~~~~~~~~~~~~
To inspect Redis, you can use the global state API. The easiest way to do this
is to start or connect to a Ray cluster with ``ray.init()``, then query the API
like so:

.. code-block:: python

 import ray

 ray.init()
 ray.nodes()
 # Returns current information about the nodes in the cluster, such as:
 # [{'ClientID': '2a9d2b34ad24a37ed54e4fcd32bf19f915742f5b',
 #   'IsInsertion': True,
 #   'NodeManagerAddress': '1.2.3.4',
 #   'NodeManagerPort': 43280,
 #   'ObjectManagerPort': 38062,
 #   'ObjectStoreSocketName': '/tmp/ray/session_2019-01-21_16-28-05_4216/sockets/plasma_store',
 #   'RayletSocketName': '/tmp/ray/session_2019-01-21_16-28-05_4216/sockets/raylet',
 #   'Resources': {'CPU': 8.0, 'GPU': 1.0}}]

To inspect the primary Redis shard manually, you can also query with commands
like the following.

.. code-block:: python

 r_primary = ray.worker.global_worker.redis_client
 r_primary.keys("*")

To inspect other Redis shards, you will need to create a new Redis client.
For example (assuming the relevant IP address is ``127.0.0.1`` and the
relevant port is ``1234``), you can do this as follows.

.. code-block:: python

 import redis
 r = redis.StrictRedis(host='127.0.0.1', port=1234)

You can find a list of the relevant IP addresses and ports by running:

.. code-block:: python

 r_primary.lrange('RedisShards', 0, -1)
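
The entries returned by this call can then be turned into clients for the
individual shards. The sketch below assumes that each entry is a ``host:port``
string (possibly returned as ``bytes``); that format is an assumption and
should be checked against your Ray version. The helper name
``parse_shard_address`` is hypothetical.

.. code-block:: python

 def parse_shard_address(entry):
     """Split one 'host:port' entry into a (host, port) tuple."""
     if isinstance(entry, bytes):
         entry = entry.decode("utf-8")
     # Split on the last colon so that only the port is separated off.
     host, port = entry.rsplit(":", 1)
     return host, int(port)


 # Example usage (requires a running cluster and the r_primary client):
 # import redis
 # shard_clients = [
 #     redis.StrictRedis(host=host, port=port)
 #     for host, port in map(parse_shard_address,
 #                           r_primary.lrange("RedisShards", 0, -1))
 # ]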

.. _backend-logging:

Backend logging
~~~~~~~~~~~~~~~
The ``raylet`` process logs detailed information about events like task
execution and object transfers between nodes. To set the logging level at
runtime, you can set the ``RAY_BACKEND_LOG_LEVEL`` environment variable before
starting Ray. For example, you can do:

.. code-block:: shell

 export RAY_BACKEND_LOG_LEVEL=debug
 ray start

This will print any ``RAY_LOG(DEBUG)`` lines in the source code to the
``raylet.err`` file, which you can find in the `Temporary Files`_.

Testing locally
---------------
Suppose that one of the tests (e.g., ``test_basic.py``) is failing. You can run
that test file locally with
``python -m pytest -v python/ray/tests/test_basic.py``. However, doing so runs
every test in the file, which can take a while. To run only the specific test
that is failing, you can do:

.. code-block:: shell

 cd ray
 python -m pytest -v python/ray/tests/test_basic.py::test_keyword_args

When running tests, usually only the first test failure matters. A single
test failure often triggers the failure of subsequent tests in the same
script.
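
Because the first failure is usually the one that matters, it can help to stop
the run as soon as it occurs. The sketch below builds such a command using
pytest's ``-x`` flag (exit on first failure). The helper name
``pytest_command`` is hypothetical and not part of the Ray repository.

.. code-block:: python

 import sys


 def pytest_command(test_file, test_name=None, stop_on_first_failure=True):
     """Build a pytest invocation for a whole file or a single test in it."""
     node_id = test_file
     if test_name is not None:
         # pytest selects a single test with the file::test_name syntax.
         node_id = "{}::{}".format(test_file, test_name)
     cmd = [sys.executable, "-m", "pytest", "-v"]
     if stop_on_first_failure:
         cmd.append("-x")
     cmd.append(node_id)
     return cmd


 # Example usage, run from the ray directory:
 # import subprocess
 # subprocess.run(pytest_command("python/ray/tests/test_basic.py",
 #                               "test_keyword_args"))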

To compile and run all C++ tests, you can run:

.. code-block:: shell

 cd ray
 bazel test $(bazel query 'kind(cc_test, ...)')


Linting
-------

**Running linter locally:** To run the Python linter on a specific file, run
something like ``flake8 ray/python/ray/worker.py``. You may need to first run
``pip install flake8``.

**Autoformatting code:** We use `yapf <https://github.com/google/yapf>`_ for
formatting, and the config file is located at ``.style.yapf``. We recommend
running ``scripts/yapf.sh`` prior to pushing to format changed files.
Note that some projects such as dataframes and rllib are currently excluded.


.. _`issues`: https://github.com/ray-project/ray/issues
.. _`Temporary Files`: http://ray.readthedocs.io/en/latest/tempfile.html