Tracing
=======
To help debug and monitor Ray applications, Ray integrates with OpenTelemetry to make it easy to export traces to external tracing stacks such as Jaeger.

.. note::

    Tracing is currently an experimental feature and under active development. APIs are subject to change.

Getting Started
---------------
First, install OpenTelemetry.

.. code-block:: shell

    pip install opentelemetry-api==1.1.0
    pip install opentelemetry-sdk==1.1.0
    pip install opentelemetry-exporter-otlp==1.1.0

Tracing Startup Hook
--------------------
To enable tracing, you must provide a tracing startup hook: a function that sets up the :ref:`Tracer Provider <tracer-provider>`, :ref:`Remote Span Processors <remote-span-processors>`, and :ref:`Additional Instruments <additional-instruments>`. The hook is called with no args or kwargs, and it must be available in the Python environment of all the worker processes.

Below is an example tracing startup hook that sets up the default tracer provider, exports spans to files in ``/tmp/spans``, and does not configure any additional instruments.

.. code-block:: python

    import ray
    import os
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import (
        ConsoleSpanExporter,
        SimpleSpanProcessor,
    )


    def setup_tracing() -> None:
        # Creates /tmp/spans folder
        os.makedirs("/tmp/spans", exist_ok=True)
        # Sets the tracer_provider. This is only allowed once per execution
        # context and will log a warning if attempted multiple times.
        trace.set_tracer_provider(TracerProvider())
        trace.get_tracer_provider().add_span_processor(
            SimpleSpanProcessor(
                ConsoleSpanExporter(
                    out=open(f"/tmp/spans/{os.getpid()}.json", "a")
                )
            )
        )

For open-source users who want to experiment with tracing, Ray has a default tracing startup hook that exports spans to the folder ``/tmp/spans``. To use this default hook, run one of the following samples, which set up tracing and trace a simple Ray task.

.. tabbed:: ray start

    .. code-block:: shell

        $ ray start --head --tracing-startup-hook=ray.util.tracing.setup_local_tmp_tracing:setup_tracing
        $ python
        >>> import ray
        >>> ray.init()
        >>> @ray.remote
        ... def my_function():
        ...     return 1
        ...
        >>> obj_ref = my_function.remote()

.. tabbed:: ray.init()

    .. code-block:: python

        >>> import ray
        >>> ray.init(_tracing_startup_hook="ray.util.tracing.setup_local_tmp_tracing:setup_tracing")
        >>> @ray.remote
        ... def my_function():
        ...     return 1
        ...
        >>> obj_ref = my_function.remote()

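Once a task has run under a hook like the ones above, the files in ``/tmp/spans`` can be inspected with the standard library. The sketch below assumes each file holds JSON objects written one after another, which is how ``ConsoleSpanExporter`` formats spans by default. ``iter_spans`` and ``span_names`` are hypothetical helpers, and the sample text is made up for illustration rather than taken from real Ray output.

.. code-block:: python

    import json
    from pathlib import Path
    from typing import Iterator, List


    def iter_spans(text: str) -> Iterator[dict]:
        """Yield each JSON object from a string of back-to-back JSON documents."""
        decoder = json.JSONDecoder()
        pos = 0
        while pos < len(text):
            # raw_decode does not skip leading whitespace, so do it here.
            while pos < len(text) and text[pos].isspace():
                pos += 1
            if pos >= len(text):
                break
            obj, pos = decoder.raw_decode(text, pos)
            yield obj


    def span_names(folder: str) -> List[str]:
        """Collect the "name" field of every span found under ``folder``."""
        names = []
        for path in sorted(Path(folder).glob("*.json")):
            for span in iter_spans(path.read_text()):
                names.append(span.get("name"))
        return names


    # Made-up sample standing in for the contents of one span file.
    sample = '{\n    "name": "span-a"\n}\n{\n    "name": "span-b"\n}\n'
    for span in iter_spans(sample):
        print(span["name"])

Calling ``span_names("/tmp/spans")`` after running a traced task would list the recorded span names, provided the files follow the assumed layout.
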
To use your own tracing startup hook, pass it in the format ``module:attribute``, where ``attribute`` is the ``setup_tracing`` function to run.

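The ``module:attribute`` format can be illustrated with a small resolver. This is only a sketch of how such a string could be turned into a callable using the standard library; ``load_hook`` is a hypothetical helper, not Ray's actual implementation, and ``os:getcwd`` merely stands in for a real hook path.

.. code-block:: python

    import importlib
    from typing import Callable


    def load_hook(spec: str) -> Callable[[], object]:
        """Resolve a ``module:attribute`` string to a callable (hypothetical helper)."""
        module_name, sep, attr_name = spec.partition(":")
        if not sep or not module_name or not attr_name:
            raise ValueError(f"expected 'module:attribute', got {spec!r}")
        # The module must be importable in this process, which mirrors the
        # requirement that the hook be available on every worker.
        module = importlib.import_module(module_name)
        hook = getattr(module, attr_name)
        if not callable(hook):
            raise TypeError(f"{spec!r} does not refer to a callable")
        return hook


    # ``os:getcwd`` is a stand-in for a real hook path.
    hook = load_hook("os:getcwd")
    hook()  # a startup hook is called with no args or kwargs

If the module cannot be imported, ``importlib.import_module`` raises ``ModuleNotFoundError``, which is a quick way to check that a custom hook is installed in a given environment.
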
.. _tracer-provider:

Tracer Provider
~~~~~~~~~~~~~~~
This configures how to collect traces. View the TracerProvider API `here <https://open-telemetry.github.io/opentelemetry-python/sdk/trace.html#opentelemetry.sdk.trace.TracerProvider>`__.

.. _remote-span-processors:

Remote Span Processors
~~~~~~~~~~~~~~~~~~~~~~
This configures where to export traces to. View the SpanProcessor API `here <https://open-telemetry.github.io/opentelemetry-python/sdk/trace.html#opentelemetry.sdk.trace.SpanProcessor>`__.

Users who want to experiment with tracing can configure their remote span processors to export spans to a local JSON file. Users who need a fuller setup while developing locally can push their traces to Jaeger containers via the `Jaeger exporter <https://open-telemetry.github.io/opentelemetry-python/exporter/jaeger/jaeger.html>`_.

.. _additional-instruments:

Additional Instruments
~~~~~~~~~~~~~~~~~~~~~~
If you are using a library that has built-in tracing support, the ``setup_tracing`` function you provide should also patch that library. You can find more documentation on instrumenting these libraries `here <https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation>`_.

Custom Traces
*************
You can add custom tracing to your programs. Within your program, get the tracer object with ``trace.get_tracer(__name__)`` and start a new span with ``tracer.start_as_current_span(...)``.

Below is a simple example of adding custom tracing.

.. code-block:: python

    import ray
    from opentelemetry import trace

    @ray.remote
    def my_func():
        tracer = trace.get_tracer(__name__)

        with tracer.start_as_current_span("foo"):
            print("Hello world from OpenTelemetry Python!")