Mirror of https://github.com/vale981/ray (synced 2025-03-05 10:01:43 -05:00)
This PR fixes initialization artifacts related to the load metrics summary and the autoscaler summary.

Load metrics summaries are defined to be falsy if the autoscaler has never received a resource message from the GCS. We skip most autoscaler actions if load metrics is falsy, because it doesn't make sense to autoscale without load metrics. This also allows us to execute the TODO here: #22348 (comment) and remove the time.wait().

As for the autoscaler summary, it is possible for autoscaler.summary() to error outside of an autoscaler update in this scenario: the very first call to NodeProvider.non_terminated_nodes fails, self.non_terminated_nodes remains a None object, and autoscaler.summary() fails trying to get an attribute of this None object. The result is a confusing error message, as in #22515. This PR fixes that.

Closes #22515

Directory contents:

- autoscaler
- base-deps
- development
- examples
- kuberay-autoscaler
- ray
- ray-deps
- ray-ml
- ray-worker-container
- retag-lambda
- fix-docker-latest.sh
- README.md

Overview of how the ray images are built:

Images without a "-cpu" or "-gpu" tag are built on `ubuntu/focal`. They are just an alias for -cpu (e.g. `ray:latest` is the same as `ray:latest-cpu`).

    ubuntu/focal
    └── base-deps:cpu
        └── ray-deps:cpu
            └── ray:cpu
                └── ray-ml:cpu

    nvidia/cuda
    └── base-deps:gpu
        └── ray-deps:gpu
            └── ray:gpu
                └── ray-ml:gpu
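The alias rule and the build hierarchy above can be sketched in a few lines of Python. This is only an illustration, not part of the Ray build tooling: the helper names `normalize_tag` and `build_chain` are hypothetical, and the sketch assumes that a tag counts as explicit only when it is `cpu`/`gpu` or ends in `-cpu`/`-gpu`. Only the image names, the two base images, and the "untagged means -cpu" rule come from the overview.

```python
from typing import List

# Base image per flavor and build order, as listed in the overview above.
BASE_IMAGES = {"cpu": "ubuntu/focal", "gpu": "nvidia/cuda"}
BUILD_ORDER = ["base-deps", "ray-deps", "ray", "ray-ml"]


def normalize_tag(image: str) -> str:
    """Resolve the alias: a tag without -cpu/-gpu means the -cpu image.

    Hypothetical helper. Assumes the tag is the part after the first ":"
    and that a missing tag means "latest" (the usual Docker default).
    """
    name, _, tag = image.partition(":")
    tag = tag or "latest"
    if tag in ("cpu", "gpu") or tag.endswith(("-cpu", "-gpu")):
        return f"{name}:{tag}"
    return f"{name}:{tag}-cpu"


def build_chain(image_name: str, flavor: str) -> List[str]:
    """List the images built on the way to image_name, base image first.

    Hypothetical helper that walks the hierarchy shown in the tree above.
    """
    if flavor not in BASE_IMAGES:
        raise ValueError(f"unknown flavor: {flavor!r}")
    if image_name not in BUILD_ORDER:
        raise ValueError(f"unknown image: {image_name!r}")
    last = BUILD_ORDER.index(image_name)
    return [BASE_IMAGES[flavor]] + [f"{n}:{flavor}" for n in BUILD_ORDER[: last + 1]]


if __name__ == "__main__":
    print(normalize_tag("ray:latest"))
    print(" -> ".join(build_chain("ray-ml", "gpu")))
```

For example, `build_chain("ray", "cpu")` stops at `ray:cpu` and does not include `ray-ml:cpu`, since `ray-ml` is built on top of `ray` and not the other way around.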