Differences between Existing MapReduce and YARN (MRv2)

Can someone explain the differences between the existing MapReduce (MRv1) and YARN (MRv2)? The differences between the two are not obvious to me.

PS: I'm asking for something like a comparison between the two.

Thanks!

+6
2 answers

MRv1 uses a single JobTracker to create tasks and assign them to the TaskTrackers running on the data nodes. The JobTracker can become a resource bottleneck when the cluster scales out far enough (usually around 4,000 nodes).

MRv2 (aka YARN, "Yet Another Resource Negotiator") has one ResourceManager per cluster, and each data node runs a NodeManager. For each job, one worker node acts as the ApplicationMaster, which monitors that job's resources, tasks, etc.
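To make the split of responsibilities concrete, here is a minimal toy model in Python. It is only a sketch, not Hadoop code: the class names mirror the daemons mentioned above, but the methods, the job names, and the task counts are made up for illustration, and the real ResourceManager also tracks node resources and containers, which this model ignores.

```python
from dataclasses import dataclass, field


@dataclass
class JobTracker:
    """MRv1 (toy): one daemon tracks every task of every job."""
    tracked_tasks: list = field(default_factory=list)

    def submit(self, job, num_tasks):
        # Every task of every job lands in the same central list.
        self.tracked_tasks += [(job, i) for i in range(num_tasks)]


@dataclass
class ApplicationMaster:
    """MRv2 (toy): one per job, tracks only that job's tasks."""
    job: str
    tracked_tasks: list = field(default_factory=list)


@dataclass
class ResourceManager:
    """MRv2 (toy): one per cluster, only knows which applications exist."""
    app_masters: dict = field(default_factory=dict)

    def submit(self, job, num_tasks):
        # Task tracking is delegated to a per-job ApplicationMaster.
        am = ApplicationMaster(job)
        am.tracked_tasks = list(range(num_tasks))
        self.app_masters[job] = am
        return am


jt = JobTracker()
rm = ResourceManager()
# Hypothetical workload: two jobs with 1000 and 2000 tasks.
for job, n in [("wordcount", 1000), ("sort", 2000)]:
    jt.submit(job, n)
    rm.submit(job, n)

central_v1 = len(jt.tracked_tasks)  # task entries held by the single JobTracker
central_v2 = len(rm.app_masters)    # application entries held by the ResourceManager
per_am = {j: len(am.tracked_tasks) for j, am in rm.app_masters.items()}
print(central_v1, central_v2, per_am)
# prints: 3000 2 {'wordcount': 1000, 'sort': 2000}
```

In this toy model the central MRv1 daemon holds state for all 3000 tasks, while the central MRv2 daemon holds two application entries and the per-task bookkeeping is spread across the ApplicationMasters. That is the intuition behind the scaling bottleneck described above.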

+11

MRv1, also called Hadoop 1: resource management and scheduling are tightly coupled with the MapReduce programming framework (the JobTracker handles both), running on top of HDFS. Because of this, non-batch applications cannot be run on Hadoop 1. It has a single NameNode, so it does not provide high availability or scalability.

MRv2 (aka Hadoop 2): in this version of Hadoop, resource management and scheduling are separated from MapReduce and handled by YARN (Yet Another Resource Negotiator). The resource management and scheduling layer sits beneath the MapReduce layer. Hadoop 2 also provides high availability and scalability, since we can create redundant NameNodes. A new snapshot feature lets us back up the file system, which helps with disaster recovery.
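A rough way to picture why separating resource management from MapReduce lets non-batch applications share the cluster: the scheduler hands out generic containers and does not care what runs inside them. The Python sketch below is a toy model only. The `allocate` method, the node names, the application names, and the memory sizes are all hypothetical, and real YARN schedulers use queues and much more elaborate policies.

```python
class ResourceManager:
    """Toy scheduler: grants memory-sized containers to any kind of application."""

    def __init__(self, node_memory_mb):
        # Free memory per node, as a NodeManager might report it (assumed values).
        self.free = dict(node_memory_mb)

    def allocate(self, app, memory_mb):
        """Grant a container on the first node with enough free memory."""
        for node, free in self.free.items():
            if free >= memory_mb:
                self.free[node] = free - memory_mb
                return {"app": app, "node": node, "memory_mb": memory_mb}
        return None  # no node has enough free memory


rm = ResourceManager({"node1": 4096, "node2": 4096})
grants = [
    rm.allocate("mapreduce-job", 2048),      # batch workload
    rm.allocate("streaming-service", 3072),  # non-batch workload
    rm.allocate("mapreduce-job", 2048),
]
for g in grants:
    print(g["app"], "->", g["node"])
```

The scheduler never inspects the application type, so a MapReduce job and a long-running service are treated the same way. In this model, MapReduce is just one of several frameworks that request containers.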

+3

Source: https://habr.com/ru/post/952576/

