How does Hadoop calculate the percentage of work done?

I see that whenever I run a MapReduce job, Hadoop shows me the percentage of completed "Map" and "Reduce" operations.

I understand that both the mappers and the reducers work in a distributed manner and can report how much they have processed to the master.

But how does the master know the total amount of data to process? If the master tried to determine the size of all the input files, I would expect that to be inefficient. Is it some crude approximation?


1 answer

I have not read all of the Hadoop code related to this part, but here are some thoughts on it; I hope they help.

  • : , - , , , , , , .

  • : , , 5%, . , -, , . , "".

P.S. The reduce side has three phases: copy (33%), sort (66%), and reduce (100%).
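The three percentages in the postscript can be illustrated with a small sketch. This is not Hadoop's code: the function names are hypothetical, and it assumes that a reduce task passes through three equally weighted phases (copy, sort, reduce) and that job-level progress is simply the mean of the per-task figures.

```python
# Hypothetical sketch, not Hadoop's actual implementation.
# Assumption: a reduce task has three phases of equal weight.
PHASES = ("copy", "sort", "reduce")


def reduce_task_progress(phase, phase_fraction):
    """Overall progress of one reduce task, as a value in [0, 1].

    phase          -- the phase the task is currently in
    phase_fraction -- how far through that phase it is, in [0, 1]
    """
    if phase not in PHASES:
        raise ValueError("unknown phase: %r" % (phase,))
    if not 0.0 <= phase_fraction <= 1.0:
        raise ValueError("phase_fraction must be between 0 and 1")
    completed_phases = PHASES.index(phase)
    return (completed_phases + phase_fraction) / len(PHASES)


def job_progress(task_progress):
    """Assumed aggregation: the mean of the per-task progress values."""
    if not task_progress:
        return 0.0
    return sum(task_progress) / len(task_progress)


if __name__ == "__main__":
    # A task halfway through the sort phase.
    print(reduce_task_progress("sort", 0.5))
    # Three tasks: not started, half done, finished.
    print(job_progress([0.0, 0.5, 1.0]))
```

Under these assumptions the master only needs the number of tasks and the fraction each task reports about itself, so it never has to measure the total input size.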

Source: https://habr.com/ru/post/1526479/
