Some Ray client users are likely seeing an issue similar to #7084. Inside a container, connecting to localhost: fails but connecting to 127.0.0.1: succeeds. Changing Ray client to use 127.0.0.1 for localhost connection / serving should fix the issue.
This is the stack of PRs that supports out or order execution for threaded/async actors. Next PR #20149
At a high level, threaded actor/async actor already don't guarantee execution order, and the current "sequential" order implementation has caused some confusion and inconvenience. Please refer to #19822 for detailed discussion.
The major changes of this stack is
introduce OutOfOrderActorSubmitQueue ([Core][actor out-of-order execution 3/n] Introducing out-of-order actor submit queue #20150)
Specifically, we have a per-client task submit queue, which guarantees the sequential order of task submission. In [Core][actor out-of-order execution 3/n] Introducing out-of-order actor submit queue #20150 we introduce OutOfOrderActorSubmitQueue, which relaxes the guarantee; it send the task over the network as soon as its dependency is resolved.
-- there are 2 PRs ([Core][actor out-of-order execution 1/n] Move CoreWorkerDirectActorTaskSubmitter into a separate file #20148 and [Core][actor out-of-order execution 2/n] create abstraction for the queuing logic on the client/actor submission. #20149) precedes this PR, which refactor the actor submission logic to make the abstraction possible.
OutOfOrderActorSchedullingQueue ([Core][actor out-of-order execution 5/n] implement out-of-order scheduling queue #20176)
Similarly, we also have a per-client task scheduling queue on the actor to ensure tasks are executed according to the submission order (sequence_no). OutOfOrderActorSchedullingQueue relaxes the guarantee by enqueuing the task as soon as all their dependencies are resolved.
-- there is one PR ([Core][actor out-of-order execution 4/n] refactor the actor receiver code #20160) precedes this PR, which refactor the actor scheduling logic to make it the code easier to read; however this one is optional.
plumbing PR ([Core][actor out-of-order execution 6/n] plumbing work to make it work e2e #20177)
This PR enables the out of order execution by introducing options(execute_out_of_order=True); and create actor client/server components according to the configuration.
There are something the PR hasn't touched upon is the restart/retry guarantees, which might need some discussion.
This PR we separate client from server code for Actor task submission. This makes the follow up change easier.