ray/dashboard/modules/job
Archit Kulkarni 058c239cf1
[runtime env] Test common failure scenarios (#25977)
Tests the following failure scenarios:
- Fail to upload data in `ray.init()` (`working_dir`, `py_modules`)
- Eager install fails in `ray.init()` for some other reason (bad `pip` package)
- Fail to download data from GCS (`working_dir`)

Improves the following error message cases:
- Return `RuntimeEnvSetupError` on failure to upload `working_dir` or `py_modules`
- Return `RuntimeEnvSetupError` on failure to download files from GCS during runtime env setup
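
As a rough illustration of the error-message change, a lower-level failure can be caught and re-raised as a setup error so the caller sees a single, descriptive exception type. This is a minimal sketch, not the code in this PR: the `RuntimeEnvSetupError` class below is a local stand-in for Ray's exception of the same name, and `upload_working_dir`, `upload_fn`, and the message text are hypothetical.

```python
class RuntimeEnvSetupError(Exception):
    """Local stand-in for Ray's exception of the same name (assumption)."""


def upload_working_dir(upload_fn, uri: str, data: bytes) -> None:
    """Call upload_fn and re-raise any failure as RuntimeEnvSetupError.

    upload_fn is a hypothetical callable taking (uri, data); it stands in
    for whatever actually pushes the package to the cluster.
    """
    try:
        upload_fn(uri, data)
    except Exception as e:
        # Chain the original exception so the root cause stays visible.
        raise RuntimeEnvSetupError(
            f"Failed to upload package {uri}: {e}"
        ) from e
```

Chaining with `from e` keeps the original exception available as `__cause__`, which helps when the wrapped message alone is not enough to debug the failure.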

Not covered in this PR:
- RPC to agent fails (This is extremely rare because the Raylet and agent are on the same node.)
- Agent is not started or dead (We don't need to worry about this because the Raylet fate shares with the agent.)

The approach is to use environment variables to induce failures in various places. The alternative would be to refactor the packaging code to use dependency injection for the internal KV client so that we can pass in a fake. I'm not sure how much of an improvement this would be: I think we'd still have to set an environment variable to pass in the fake client, because these are essentially e2e tests of `ray.init()` and we don't have an API to pass it in.
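
The environment-variable approach might look roughly like the sketch below. Everything in it is assumed for illustration: the variable name `EXAMPLE_FAIL_UPLOAD`, the `upload_package` function, and the dict used as a fake store are hypothetical and are not the names used in this PR.

```python
import os
from contextlib import contextmanager

# Hypothetical variable name, chosen for this sketch only.
FAIL_UPLOAD_ENV_VAR = "EXAMPLE_FAIL_UPLOAD"


def upload_package(store: dict, uri: str, data: bytes) -> None:
    """Store data under uri, unless the failure switch is set."""
    if os.environ.get(FAIL_UPLOAD_ENV_VAR) == "1":
        # Test-only hook: simulate the upload failing.
        raise RuntimeError(f"Simulated upload failure for {uri}")
    store[uri] = data


@contextmanager
def env_var(name: str, value: str):
    """Temporarily set an environment variable, restoring it on exit."""
    old = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if old is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = old
```

A test can then wrap the call in `with env_var(FAIL_UPLOAD_ENV_VAR, "1"):` and assert that the expected error surfaces. Because the switch is read from the process environment, it also reaches code paths that the test cannot parameterize directly.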
2022-08-15 11:35:56 -05:00
tests [runtime env] Test common failure scenarios (#25977) 2022-08-15 11:35:56 -05:00
__init__.py Job module without submission (#13081) 2020-12-31 11:12:17 +08:00
cli.py Fix the jobs tab in the beta dashboard and fill it with data from both "submission" jobs and "driver" jobs (#25902) 2022-07-27 02:39:52 -07:00
common.py Convert job_manager to be async (#27123) 2022-08-05 19:33:49 -07:00
job_head.py Convert job_manager to be async (#27123) 2022-08-05 19:33:49 -07:00
job_manager.py Add maximum number of characters in logs output for jobs status message (#27581) 2022-08-08 20:24:51 -07:00
pydantic_models.py Fix the jobs tab in the beta dashboard and fill it with data from both "submission" jobs and "driver" jobs (#25902) 2022-07-27 02:39:52 -07:00
sdk.py Fix the jobs tab in the beta dashboard and fill it with data from both "submission" jobs and "driver" jobs (#25902) 2022-07-27 02:39:52 -07:00
utils.py Add maximum number of characters in logs output for jobs status message (#27581) 2022-08-08 20:24:51 -07:00