SangBin Cho
a7e759317b
[State Observability API] Error handling ( #24413 )
...
This improves error handling per https://docs.google.com/document/d/1IeEsJOiurg-zctOcBjY-tQVbsCmURFSnUCTkx_4a7Cw/edit#heading=h.pdzl9cil9e8z (the RPC part).
Semantics
If all queries to the source fail, raise a RayStateApiException.
If only some queries fail, report the partial failure through warnings.warn when print_api_stats=True. The flag is True for the CLI. It is False when the API is called from Python or when json / yaml output is requested.
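The semantics above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `query_sources` helper and the shape of its `queries` argument are hypothetical, and `RayStateApiException` is redefined locally as a stand-in for the exception named above.

```python
import warnings
from typing import Callable, Dict, List


class RayStateApiException(Exception):
    """Local stand-in for the exception named in the commit message."""


def query_sources(
    queries: Dict[str, Callable[[], list]],
    print_api_stats: bool = False,
) -> list:
    """Run every query and merge the results (hypothetical helper).

    Raises RayStateApiException if every query fails. If only some fail,
    the failure is reported with warnings.warn, but only when
    print_api_stats is True.
    """
    results: list = []
    failed: List[str] = []
    for name, query in queries.items():
        try:
            results.extend(query())
        except Exception as exc:  # a real implementation would catch RPC errors
            failed.append(f"{name}: {exc}")

    if queries and len(failed) == len(queries):
        raise RayStateApiException(
            "All queries to the source failed: " + "; ".join(failed)
        )
    if failed and print_api_stats:
        warnings.warn(
            f"{len(failed)}/{len(queries)} queries failed: " + "; ".join(failed)
        )
    return results
```

With print_api_stats=False the partial failure is silent and the caller only sees the results that did arrive, which keeps json / yaml output free of extra text.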
2022-05-24 03:56:49 -07:00
SangBin Cho
ec653e3196
[Nightly test] Move two line downloads to one line. ( #25061 )
...
This fixes a mysterious error in which every cluster env build fails when pip uninstall / pip install are written on 2 lines. The root cause will be fixed later.
2022-05-22 00:07:03 -07:00
SangBin Cho
b9c30529d8
[Core/Observability 1/N] Add a "running" state to task status ( #24651 )
...
This PR adds 2 more states to TaskStatus:
enum TaskStatus {
  // The task is scheduled properly and waiting for execution.
  // It includes time to deliver the task to the remote worker + queueing time
  // from the execution side.
  WAITING_FOR_EXECUTION = 5;
  // The task that is running.
  RUNNING = 6;
}
2022-05-16 05:39:05 -07:00
SangBin Cho
2bce07d4ce
[State API] List runtime env API ( #24126 )
...
This PR adds support for the list runtime env API.
2022-05-02 14:01:00 -07:00
SangBin Cho
73ed67e9e6
[State API] State api limit + Removing unnecessary modules ( #24098 )
...
This PR does the following:
Moves all routes into the same module, state_head.py.
Adds support for a limit feature.
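The commit message does not say how the limit is applied. As a rough sketch under that caveat, a limit on a listing could be enforced by truncating the result stream. The `apply_limit` helper below is hypothetical and is not taken from state_head.py.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def apply_limit(entries: Iterable[T], limit: int) -> List[T]:
    """Return at most `limit` entries, preserving order (hypothetical helper)."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    # islice stops after `limit` items, so a large or lazy source
    # is never consumed in full.
    return list(islice(entries, limit))
```

Using islice rather than building the full list and slicing it means the source is read only as far as needed.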
2022-04-22 15:59:46 -07:00
SangBin Cho
30ab5458a7
[State Observability] Tasks and Objects API ( #23912 )
...
This PR implements the ray list tasks and ray list objects APIs.
NOTE: You can ignore the merge conflict for now. It is because the first PR was reverted. There's a fix PR open now.
2022-04-21 18:45:03 -07:00
SangBin Cho
1c3329fa38
Revert "Revert "[State Observability] Basic functionality for central… ( #23933 )
...
…ized data (#23744 )" (#23918 )"
This reverts commit fb14e82.
2022-04-18 21:15:43 -07:00
Amog Kamsetty
fb14e82242
Revert "[State Observability] Basic functionality for centralized data ( #23744 )" ( #23918 )
...
This reverts commit 51a4a1a802.
The reverted commit was breaking the tune multinode tests and kuberay:test_autoscaling_e2e.
2022-04-14 14:28:42 -07:00
SangBin Cho
51a4a1a802
[State Observability] Basic functionality for centralized data ( #23744 )
...
Supports listing actors/pgs/jobs/nodes/workers.
Design doc: https://docs.google.com/document/d/1IeEsJOiurg-zctOcBjY-tQVbsCmURFSnUCTkx_4a7Cw/edit#heading=h.9ub9e6yvu9p2
Note that the output in this PR contains only ids. I will add the remaining output in follow-up PRs.
2022-04-14 07:33:18 -07:00