* [core] nicer error message for unpickleable exceptions
I ran into a case where we had an exception that wasn't unpickleable:
```
pickle.loads(pickle.dumps(filelock.Timeout()))
```
When a filelock.Timeout is raised on a worker, it gets surfaced in a way
that makes ray look like it was responsible:
```
ray.exceptions.RaySystemError: System error: __init__() missing 1 required positional argument: 'lock_file'
```
This PR turns the following stacktrace:
```
return ray.get(refs, timeout=timeout)
File "/opt/conda/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/ray/worker.py", line 1623, in get
raise value
ray.exceptions.RaySystemError: System error: __init__() missing 1 required positional argument: 'lock_file'
traceback: Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/ray/serialization.py", line 254, in deserialize_objects
obj = self._deserialize_object(data, metadata, object_ref)
File "/opt/conda/lib/python3.7/site-packages/ray/serialization.py", line 213, in _deserialize_object
return RayError.from_bytes(obj)
File "/opt/conda/lib/python3.7/site-packages/ray/exceptions.py", line 28, in from_bytes
return pickle.loads(ray_exception.serialized_exception)
TypeError: __init__() missing 1 required positional argument: 'lock_file'
```
into this:
```
...
return ray.get(refs, timeout=timeout)
File "/opt/conda/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/ray/worker.py", line 1623, in get
raise value
ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception
traceback: Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/ray/exceptions.py", line 29, in from_bytes
return pickle.loads(ray_exception.serialized_exception)
TypeError: __init__() missing 1 required positional argument: 'lock_file'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/ray/serialization.py", line 254, in deserialize_objects
obj = self._deserialize_object(data, metadata, object_ref)
File "/opt/conda/lib/python3.7/site-packages/ray/serialization.py", line 213, in _deserialize_object
return RayError.from_bytes(obj)
File "/opt/conda/lib/python3.7/site-packages/ray/exceptions.py", line 31, in from_bytes
raise RuntimeError("Failed to unpickle serialized exception") from e
RuntimeError: Failed to unpickle serialized exception
```
* lint
* test_unpickleable_stacktrace
* lint
* .
* .
Co-authored-by: hauntsaninja <>
* Make binbacking prioritize nodes better
Make binpacking prefer nodes that match multiple
resource types.
* spelling
* order demands when binpacking, starting from complex ones
* add stability to resource demand ordering
* lint
* logging
* add comments
* +comment
* use set
Why are these changes needed?
This PR implements workflow.delete which allows users to delete the information in storage related to a workflow. (This assumes the workflow isn't currently running).
Related issue number
Closes#18848
In general, broadcasting changes to the replicas via the LongPollClient is hard to reason about (it circumvents our versioning semantics as there's no rolling update). Ideally we would only be using the LongPollClient to broadcast replica membership and nothing else.
This PR fixes#19183 by introducing three improvements:
String trainables are prefixed with Durable, e.g. DurablePPO
Durable trainables cannot be wrapped twice with tune.durable()
MRO resolution in _WrappedDurableTrainables indicates we already have a DurableTrainable - thus we catch this with a try/except block