Is it possible to disable sorting in Hadoop?

My job does not require sorted output, only aggregation of values by key. So I am wondering whether sorting can be disabled entirely in order to improve performance.


Note: I cannot set the number of reducers to zero, because I need to aggregate data across many mappers. I am simply not interested in getting sorted output within a single reducer.

1 answer

One of the main reasons for sorting the map output is grouping: before the reduce function can be invoked, all values that share a key have to be collected into one list. With sorted map output, the reducer can build these lists with a simple sequential scan (when it sees a different key, it just starts a new list). If the map output is not sorted, it has to scan the entire input to collect all the values for a given key.
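
To make the grouping argument concrete, here is a minimal sketch in plain Python. It is not Hadoop code and not how Hadoop implements its shuffle; the function name `group_sorted` and the sample data are made up for illustration. It only shows that a single sequential pass is enough to group records when they arrive sorted by key, and that the same pass splits a key into several fragments when they do not.

```python
from typing import Iterable, Iterator, List, Tuple


def group_sorted(pairs: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, List[int]]]:
    """Group (key, value) pairs in one pass, assuming equal keys are adjacent.

    Hypothetical helper for illustration only; it is not part of any Hadoop API.
    """
    current_key = None
    values: List[int] = []
    started = False
    for key, value in pairs:
        # A different key means the previous group is complete.
        if started and key != current_key:
            yield current_key, values
            values = []
        current_key = key
        started = True
        values.append(value)
    if started:
        # Emit the last group.
        yield current_key, values


# Made-up map output, in arrival order.
map_output = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]

# Sorted input: each key ends up in exactly one group.
sorted_groups = list(group_sorted(sorted(map_output)))

# Unsorted input: the same scan emits several partial groups per key.
unsorted_groups = list(group_sorted(map_output))
```

With the sample data, `sorted_groups` contains one list per key, while `unsorted_groups` contains four single-value fragments. Grouping without sorting is still possible, for example by collecting values in a hash table keyed by the record key, but that is a different technique from the sequential scan described in the answer.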


Source: https://habr.com/ru/post/907283/

