Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.
This is the PR to implement ray log to the server side. The PR is continued from #24068.
The PR supports two endpoints;
/api/v0/logs # list logs of the node id filtered by the given glob.
/api/v0/logs/{[file | stream]}?filename&pid&actor_id&task_id&interval&lines # Stream the requested file log. The filename can be inferred by pid/actor_id/task_id
Some tests need to be re-written, I will do it soon.
As a follow-up after this PR, there will be 2 PRs.
PR to add actual CLI
PR to remove in-memory cached logs and do on-demand query for actor/worker logs
The test verifies the first line 43~51 bytes are "dashboard"
But due to recent code addition to head.py, the line where logs are written became 2 digits -> 3 digits
Previously,
2022-04-18 23:23:56,946 INFO head.py:[less than 100] -- Dashboard head grpc address: 127.0.0.1:57208
Now
2022-04-18 23:23:56,946 INFO head.py:101 -- Dashboard head grpc address: 127.0.0.1:57208
So we should increase the bytes range.
* Provide a utility to ping a Ray cluster and verify it has the same Ray version. This is useful to check if a Ray cluster is available at a given address, without connecting to the cluster with the more heavyweight ray.init(). This utility is integrated with ray memory to provide a better error message when the Ray cluster is unavailable. There seem to be user demand for exposing this as an API as well.
* Improve the error message when the address provided to Ray does not contain port.