mirror of
https://github.com/vale981/ray
synced 2025-03-06 18:41:40 -05:00
43 lines
1.3 KiB
ReStructuredText
Antipattern: Too fine-grained tasks
===================================

**TLDR:** Avoid over-parallelizing. Running a function as a parallel task has higher overhead than calling it as a normal function.

Parallelizing or distributing tasks usually comes with higher overhead than an ordinary function call. Therefore, if you parallelize a function that executes very quickly, the overhead could take longer than the actual function call!

To avoid this problem, be careful not to parallelize too much. If a function or task is too small, you can use a technique called batching so that each task does more meaningful work.

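To see why batching helps, here is a rough back-of-the-envelope sketch. It is not a measurement of Ray: the helper ``estimate_cost`` and both timing constants are hypothetical, chosen only to illustrate how a fixed per-task overhead is amortized as the batch size grows. It also ignores any speedup from running tasks in parallel.

```python
import math


def estimate_cost(num_items, batch_size, overhead_s, work_per_item_s):
    """Return (number of tasks, total overhead, total useful work) in seconds.

    Simplified model: every task pays the same fixed overhead, and the
    useful work is proportional to the number of items processed.
    """
    num_tasks = math.ceil(num_items / batch_size)
    total_overhead = num_tasks * overhead_s
    total_work = num_items * work_per_item_s
    return num_tasks, total_overhead, total_work


# Hypothetical numbers for illustration only, not measured from Ray.
OVERHEAD_S = 1e-3        # assumed fixed cost of launching one task
WORK_PER_ITEM_S = 1e-6   # assumed cost of doubling one number

for batch_size in (1, 100, 1000):
    num_tasks, overhead, work = estimate_cost(
        10_000, batch_size, OVERHEAD_S, WORK_PER_ITEM_S
    )
    print(
        f"batch_size={batch_size}: {num_tasks} tasks, "
        f"overhead ~{overhead:.3f}s, useful work ~{work:.3f}s"
    )
```

Under these assumed numbers, the useful work stays the same for every batch size, while the overhead term shrinks in proportion to the number of tasks. The right batch size in practice depends on the real overhead and workload, so it is worth measuring.
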
Code example
------------

**Antipattern:**

.. code-block:: python

    import ray

    @ray.remote
    def double(number):
        return number * 2

    numbers = list(range(10000))

    doubled_numbers = []
    for i in numbers:
        # One tiny remote task per number, and ray.get blocks on each result.
        doubled_numbers.append(ray.get(double.remote(i)))

**Better approach:** Use batching.

.. code-block:: python

    import ray

    @ray.remote
    def double_list(list_of_numbers):
        return [number * 2 for number in list_of_numbers]

    numbers = list(range(10000))
    doubled_list_refs = []
    BATCH_SIZE = 100
    for i in range(0, len(numbers), BATCH_SIZE):
        # Each remote task now processes BATCH_SIZE numbers instead of one.
        batch = numbers[i : i + BATCH_SIZE]
        doubled_list_refs.append(double_list.remote(batch))

    doubled_numbers = []