ray/dashboard
Archit Kulkarni 084f06f49a
[Doc] [Job submission] [Dashboard] Add tip for long runtime_env installation and improve error (#26911)
# Why are these changes needed?
The dashboard can display the message "<actor> cannot be created because the Ray cluster cannot satisfy its resource requirements" even when the real cause is a stalled runtime env setup. This PR updates the message to mention that the runtime env setup may have failed.
This PR adds a tip to the Job Submission doc saying that if a job is stalled in PENDING, the runtime env setup may have stalled. It adds a pointer to the log files which should have more information.
The runtime env setup cannot stall forever; it fails after 10 minutes. This timeout is a new feature added after the Ray 1.13 branch cut, so in Ray <= 1.13 the runtime env setup can still stall forever.
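
As a rough illustration of the tip described above, the sketch below polls a job's status and prints a hint once the job has stayed in PENDING for a while. It is not code from this PR: the helper name `wait_while_pending`, its parameters, and the hint text are all hypothetical, and the 600-second default only mirrors the 10-minute limit mentioned above. The status source is passed in as a plain callable that returns the status as a string, so the sketch runs without a Ray cluster.

```python
import time
from typing import Callable, Tuple


def wait_while_pending(
    get_status: Callable[[], str],
    warn_after_s: float = 60.0,   # hypothetical threshold before printing the hint
    max_wait_s: float = 600.0,    # assumption: mirrors the 10-minute setup limit
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, bool]:
    """Poll until the status leaves PENDING or max_wait_s elapses.

    Returns (last_status, warned), where warned says whether the hint was printed.
    """
    start = clock()
    warned = False
    while True:
        status = get_status()
        if status != "PENDING":
            return status, warned
        elapsed = clock() - start
        if elapsed >= max_wait_s:
            # Give up waiting; the caller decides what to do with a stuck job.
            return status, warned
        if not warned and elapsed >= warn_after_s:
            print(
                "Job is still PENDING; the runtime env setup may have stalled. "
                "Check the log files for more information."
            )
            warned = True
        sleep(poll_s)
```

With Ray installed, the callable could wrap a job status lookup from the Job Submission SDK and convert the result to a string; that wiring is left out here because it depends on the Ray version in use.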

# Related issue number
Closes #26332
2022-07-25 23:32:27 -07:00
client [Doc] [Job submission] [Dashboard] Add tip for long runtime_env installation and improve error (#26911) 2022-07-25 23:32:27 -07:00
modules [Core][State Observability] Truncate warning message is incorrect when filter is used (#26801) 2022-07-25 23:31:49 -07:00
tests [State Observability] Warn if callsite is disabled when ray list objects + raise exception on missing output (#26880) 2022-07-24 19:55:36 -07:00
__init__.py [Dashboard] New dashboard skeleton (#9099) 2020-07-27 11:34:47 +08:00
agent.py redo agent_pid -> agent_id (#25806) 2022-07-19 20:26:49 -07:00
BUILD Revert "Revert "Bump pytest from 5.4.3 to 7.0.1"" (#26525) 2022-07-18 21:21:19 -07:00
consts.py [Core | State Observability] Implement API Server (Dashboard) HTTP Requests Throttling (#26257) 2022-07-13 09:05:26 -07:00
dashboard.py [Core][cli][usability] ray stop prints errors during graceful shutdown (#25686) 2022-06-27 08:14:59 -07:00
datacenter.py [Dashboard] fix iterating over GPU processes (#23562) 2022-03-31 17:16:53 -07:00
head.py Auto reconnect for gcs aio client (#26673) 2022-07-19 13:11:09 -07:00
http_server_agent.py Revert "Revert "[Dashboard][Serve] Move Serve related endpoints to dashboard agent"" (#26336) 2022-07-06 19:37:30 -07:00
http_server_head.py [Core][cli][usability] ray stop prints errors during graceful shutdown (#25686) 2022-06-27 08:14:59 -07:00
k8s_utils.py [dashboard][kubernetes] Dashboard CPU and memory adjustments. (#21688) 2022-03-01 17:15:59 -08:00
memory_utils.py [State Observability] Summary APIs (#25672) 2022-06-22 06:21:50 -07:00
optional_deps.py [Dashboard] Agent in minimal ray installation (#21817) 2022-01-26 04:03:54 -08:00
optional_utils.py [dashboard] Update cluster_activities endpoint to use pydantic. (#26609) 2022-07-25 10:54:22 -07:00
state_aggregator.py [Core][State Observability] Truncate warning message is incorrect when filter is used (#26801) 2022-07-25 23:31:49 -07:00
utils.py Revert "Revert "[Dashboard][Serve] Move Serve related endpoints to dashboard agent"" (#26336) 2022-07-06 19:37:30 -07:00