In general, you are in one of two situations:
1. Your problem is small enough to fit into the memory of a single system, and that system has enough processing power to solve it within the required time.
2. Your problem is too big for a single system, in one of two ways:
   - 2.1 The run time is too long (disk I/O and/or processor time).
   - 2.2 The data is too large to fit into memory (RAM).
In cases 2.1 and 2.2, the MapReduce paradigm helps by breaking the work down into many small pieces that can be processed independently. If you need more CPU power, you just add more processors (or machines).
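To make "many small pieces" concrete, here is a minimal single-process sketch of the idea, using word counting as the problem. The function names (`split_into_chunks`, `map_chunk`, `reduce_counts`) and the sample data are made up for illustration; this is not the API of Hadoop or any other framework, only the shape of the paradigm.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(items, chunk_size):
    """Break the input into independent pieces of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def map_chunk(lines):
    """Map step: count the words in one chunk, independently of all others."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


# Hypothetical sample input; in practice this would be a large data set.
lines = ["a b a", "b c", "a c c", "d"]
chunks = split_into_chunks(lines, chunk_size=2)

# Each call to map_chunk depends only on its own chunk, so in a real
# cluster these calls could run on different processors or machines.
partials = [map_chunk(chunk) for chunk in chunks]

total = reduce(reduce_counts, partials, Counter())
print(dict(total))
```

Because no map call needs to see another chunk, adding CPU power only means running more of those calls at the same time.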
So even if you have only one system and your problem is too big to fit into memory (case 2.2), you can still benefit: MapReduce can easily keep part of the problem on disk until that part is due for processing.
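As a rough illustration of keeping part of the problem on disk, the sketch below counts words one batch at a time, writes each sorted partial result to a temporary file, and then merges the files as streams so that only a small amount of data is in memory at once. The helper names and the tab-separated file layout are assumptions made for this example, not how any particular MapReduce implementation stores its intermediate data.

```python
import heapq
import itertools
import os
import tempfile
from collections import Counter


def spill_partial_counts(lines, batch_size, directory):
    """Count words one batch at a time, writing each sorted partial result to disk."""
    paths = []
    iterator = iter(lines)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            break
        counts = Counter(word for line in batch for word in line.split())
        fd, path = tempfile.mkstemp(dir=directory, suffix=".part")
        with os.fdopen(fd, "w") as handle:
            for word in sorted(counts):
                handle.write(f"{word}\t{counts[word]}\n")
        paths.append(path)
    return paths


def read_partial(path):
    """Stream (word, count) pairs back from one spill file."""
    with open(path) as handle:
        for row in handle:
            word, count = row.rstrip("\n").split("\t")
            yield word, int(count)


def merge_partials(paths):
    """Merge the sorted spill files, summing the counts of equal words."""
    merged = heapq.merge(*(read_partial(p) for p in paths))
    for word, group in itertools.groupby(merged, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical sample input; batch_size stands in for "what fits in RAM".
    lines = ["a b a", "b c", "a c c", "d"]
    spill_paths = spill_partial_counts(lines, batch_size=2, directory=workdir)
    result = dict(merge_partials(spill_paths))

print(result)
```

Only one batch is held in memory during the first phase, and the merge phase reads the spill files line by line, which is why the total input can be larger than RAM.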
An important caveat: if your problem is small enough to fit into memory and small enough to process on one system, a dedicated (non-MapReduce) solution can be much faster.