Check out this blog post from Cloudera, which explains the new memory management in YARN.
Here are the relevant bits:
... An important detail of this change, which prevents job starvation given this new flexibility, is the concept of reserved containers. Imagine that two jobs are running, each of which has enough tasks to saturate more than the entire cluster. One job wants each of its mappers to get 1 GB, and the other job wants each of its mappers to get 2 GB. Suppose that the first job starts and fills the entire cluster. Whenever one of its tasks finishes, it leaves a 1 GB slot open. Even though the second job deserves the space, a naive policy will give it to the first job, because it is the only job with tasks that fit. This can cause the second job to starve indefinitely. To prevent this unfortunate situation, when space on a node is offered to an application and the application cannot use it immediately, the application reserves it, and no other application can be allocated a container on that node until the reservation is fulfilled. Each node may have only one reserved container. The total amount of reserved memory is reported in the ResourceManager UI. A high number means that it may take longer for new jobs to get space.
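To make the starvation scenario from the quote concrete, here is a minimal toy sketch in Python. It is not YARN code and does not use any YARN API: the function name `containers_granted_to_b`, the single-node model, and the assumption that job B is always offered freed space first are all made up for illustration.

```python
def containers_granted_to_b(use_reservation, completions=6):
    """Toy single-node model: job A runs 1 GB tasks, job B needs 2 GB containers.

    Assumes the node starts completely full of job A's tasks and hosts at
    least `completions` of them. Returns how many containers job B receives.
    """
    free_gb = 0        # job A has filled the node
    granted_to_b = 0
    for _ in range(completions):
        free_gb += 1   # one of job A's 1 GB tasks finishes
        if free_gb >= 2:
            # Enough room has accumulated: job B gets its 2 GB container,
            # which also fulfills any reservation it was holding.
            free_gb -= 2
            granted_to_b += 1
        elif use_reservation:
            # Job B cannot use 1 GB yet, so it reserves the node and the
            # freed space is held instead of being handed to job A.
            pass
        else:
            # Naive policy: the only job whose task fits is job A,
            # so the freed 1 GB goes straight back to it.
            free_gb -= 1
    return granted_to_b


if __name__ == "__main__":
    print("naive policy:     ", containers_granted_to_b(use_reservation=False))
    print("with reservations:", containers_granted_to_b(use_reservation=True))
```

Under the naive policy job B never receives a container in this model, no matter how many of job A's tasks finish. With the reservation, freed space accumulates until a 2 GB container fits.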