mirror of https://github.com/vale981/ray
synced 2025-03-06 02:21:39 -05:00
Commit 2cf4c72
("[ray client] Fix ctrl-c for ray.get() by setting a
short-server side timeout") introduced a short server-side timeout so
that a blocking get() does not hold up later operations.
However, that fix implicitly assumes that get() completes within
MAX_BLOCKING_OPERATION_TIME_S (two seconds). This becomes a problem
when an application transfers large objects or has limited network I/O
bandwidth, so that pushing all chunks takes more than two seconds. The
current retry logic then re-pushes from the first chunk on every
attempt, which blocks the client in an endless re-push loop.
I updated the logic to pass the timeout through directly if one is
explicitly given. Without a timeout, it still polls using
MAX_BLOCKING_OPERATION_TIME_S as the short server-side timeout.
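
The failure mode described above can be illustrated with a small model.
This is only a sketch under stated assumptions, not Ray's implementation:
simulate_get, num_chunks and seconds_per_chunk are hypothetical names, and
the model assumes every attempt restarts from the first chunk and is cut
off once op_timeout_s has elapsed.

```python
MAX_BLOCKING_OPERATION_TIME_S = 2.0  # two seconds, as stated in the commit message


def simulate_get(num_chunks, seconds_per_chunk, op_timeout_s, max_attempts=10):
    """Return the attempt number on which all chunks arrive, or None.

    Hypothetical model: each attempt starts again at chunk 0 and pushes
    chunks until the per-attempt timeout would be exceeded.
    """
    for attempt in range(1, max_attempts + 1):
        elapsed = 0.0
        pushed = 0
        while pushed < num_chunks and elapsed + seconds_per_chunk <= op_timeout_s:
            elapsed += seconds_per_chunk
            pushed += 1
        if pushed == num_chunks:
            return attempt
        # Timed out: progress is discarded and the next attempt starts over.
    return None


# A 1-second transfer fits in one attempt; a 5-second transfer never fits
# in a 2-second window, no matter how many times it is retried.
small = simulate_get(4, 0.25, MAX_BLOCKING_OPERATION_TIME_S)
large = simulate_get(10, 0.5, MAX_BLOCKING_OPERATION_TIME_S)
large_with_longer_timeout = simulate_get(10, 0.5, 10.0)
```

In this model, retrying does not help the large transfer; only a longer
per-attempt timeout lets it finish.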
parent c73f02ded5
commit e115545579
1 changed file with 7 additions and 2 deletions
@@ -421,14 +421,19 @@ class Worker:
         else:
             deadline = time.monotonic() + timeout
 
+        max_blocking_operation_time = MAX_BLOCKING_OPERATION_TIME_S
+        if "RAY_CLIENT_MAX_BLOCKING_OPERATION_TIME_S" in os.environ:
+            max_blocking_operation_time = float(
+                os.environ["RAY_CLIENT_MAX_BLOCKING_OPERATION_TIME_S"]
+            )
         while True:
             if deadline:
                 op_timeout = min(
-                    MAX_BLOCKING_OPERATION_TIME_S,
+                    max_blocking_operation_time,
                     max(deadline - time.monotonic(), 0.001),
                 )
             else:
-                op_timeout = MAX_BLOCKING_OPERATION_TIME_S
+                op_timeout = max_blocking_operation_time
             try:
                 res = self._get(to_get, op_timeout)
                 break
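
The timeout selection in the hunk above can be sketched as two standalone
helpers. This is an illustration, not the code in Worker:
resolve_max_blocking_time and next_op_timeout are hypothetical names, the
environ and now parameters exist only to make the sketch testable, and the
2.0 default is taken from the commit message. It also checks
`deadline is None` explicitly, whereas the diff tests `if deadline:`.

```python
import os
import time

MAX_BLOCKING_OPERATION_TIME_S = 2.0  # assumed default, per the commit message


def resolve_max_blocking_time(environ=os.environ):
    """Use RAY_CLIENT_MAX_BLOCKING_OPERATION_TIME_S if set, else the default."""
    value = environ.get("RAY_CLIENT_MAX_BLOCKING_OPERATION_TIME_S")
    if value is None:
        return MAX_BLOCKING_OPERATION_TIME_S
    return float(value)


def next_op_timeout(deadline, max_blocking_time, now=None):
    """Timeout for one polling attempt.

    Without a deadline, poll with the per-attempt cap. With a deadline, use
    the remaining time, capped at max_blocking_time and floored at 1 ms.
    """
    if deadline is None:
        return max_blocking_time
    if now is None:
        now = time.monotonic()
    return min(max_blocking_time, max(deadline - now, 0.001))


# Example: an override of 30 seconds with 0.5 seconds left before the deadline.
cap = resolve_max_blocking_time({"RAY_CLIENT_MAX_BLOCKING_OPERATION_TIME_S": "30"})
remaining = next_op_timeout(deadline=100.0, max_blocking_time=cap, now=99.5)
```

Raising the cap through the environment variable lets a single attempt run
long enough for a slow transfer, at the cost of a longer wait before the
client regains control on each attempt.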