
Antipattern: Too fine-grained tasks
===================================
**TLDR:** Avoid over-parallelizing. Parallelizing tasks has higher overhead than calling normal functions.

Parallelizing or distributing tasks usually comes with higher overhead than an ordinary function call. Therefore, if you parallelize a function that executes very quickly, the overhead can take longer than the actual function call!

To handle this problem, be careful not to parallelize too much. If a function or task is too small, you can use a technique called **batching** to make each task do a more meaningful amount of work.

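As a rough illustration of this overhead (assuming Ray is installed and can start a local cluster), you can time an ordinary function call against the same function submitted as a Ray task. The exact numbers depend on your machine, but the remote call typically takes orders of magnitude longer than the multiplication itself:

.. code-block:: python

    import time

    import ray

    ray.init()

    def double(number):
        return number * 2

    # The same function, wrapped as a Ray task.
    double_task = ray.remote(double)

    # Time the plain, in-process function call.
    start = time.perf_counter()
    double(5)
    local_s = time.perf_counter() - start

    # Time one remote task end to end: scheduling, execution, and fetching the result.
    start = time.perf_counter()
    ray.get(double_task.remote(5))
    remote_s = time.perf_counter() - start

    print(f"local call: {local_s:.2e}s, remote task: {remote_s:.2e}s")
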
Code example
------------
**Antipattern:**

.. code-block:: python

    import ray

    @ray.remote
    def double(number):
        return number * 2

    numbers = list(range(10000))

    doubled_numbers = []
    for i in numbers:
        # Each iteration submits a tiny task and immediately blocks on its
        # result, so the per-task overhead dwarfs the single multiplication.
        doubled_numbers.append(ray.get(double.remote(i)))

**Better approach:** Use batching.

.. code-block:: python

    @ray.remote
    def double_list(list_of_numbers):
        return [number * 2 for number in list_of_numbers]

    numbers = list(range(10000))

    doubled_list_refs = []
    BATCH_SIZE = 100
    for i in range(0, len(numbers), BATCH_SIZE):
        # Submit one task per batch of 100 numbers instead of one task per number.
        batch = numbers[i : i + BATCH_SIZE]
        doubled_list_refs.append(double_list.remote(batch))

    doubled_numbers = []
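    # Each ObjectRef resolves to a list of 100 doubled numbers; one straightforward
    # way to collect the batched results is to flatten them back into a single list.
    for ref in doubled_list_refs:
        doubled_numbers.extend(ray.get(ref))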