If you are familiar with threads, you can treat each node as a thread (to an extent).
You send a message (work) to a node, it does some work, and then it returns some results.
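The send-work, get-results pattern can be sketched roughly as follows. This is not MPI itself: it uses Python's standard `multiprocessing` module as a stand-in, with a separate process playing the role of a node and a pipe playing the role of the message channel. The names `node`, `run_job`, and `_context` are made up for this illustration.

```python
import multiprocessing as mp


def _context():
    # Assumption for this sketch: prefer the "fork" start method where it
    # exists, otherwise fall back to the platform default.
    if "fork" in mp.get_all_start_methods():
        return mp.get_context("fork")
    return mp.get_context()


def node(conn):
    """Stand-in for an MPI node: wait for work, process it, reply."""
    work = conn.recv()                   # blocks until a message arrives
    conn.send(sum(x * x for x in work))  # send the result back
    conn.close()


def run_job(work):
    """Send one piece of work to a node and wait for its result."""
    ctx = _context()
    here, there = ctx.Pipe()
    proc = ctx.Process(target=node, args=(there,))
    proc.start()
    here.send(work)       # the message (work) goes to the node
    result = here.recv()  # the node returns some results
    proc.join()
    return result


if __name__ == "__main__":
    print(run_job([1, 2, 3, 4]))  # sum of squares of the list
```

In real MPI the same roles are played by ranks and by send/receive calls on a communicator, but the shape of the exchange is the same.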
Similarities between threads and MPI:
Both are about dividing the work into parts and processing those parts separately.
Both carry overhead that grows as more nodes/threads are involved. MPI overhead is more significant than thread overhead, because passing messages between nodes can be expensive. If the work is not carefully partitioned, you may end up with message transfer time > the computational time required to process the job.
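A back-of-the-envelope check for the "transfer time > computation time" case might look like the sketch below. The cost model (a fixed latency plus size divided by bandwidth) and the default numbers are assumptions chosen for illustration, not measurements of any real network, and the function names are hypothetical.

```python
def transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple assumed model: fixed per-message latency plus size / bandwidth."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def is_transfer_bound(n_bytes, compute_s,
                      latency_s=1e-4, bandwidth_bytes_per_s=1e9):
    """True if shipping the data costs more than processing it on the node."""
    return transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s) > compute_s


if __name__ == "__main__":
    # 8 MB of data for 1 ms of computation: the message dominates.
    print(is_transfer_bound(8_000_000, compute_s=0.001))
    # 8 KB of data for 1 s of computation: the computation dominates.
    print(is_transfer_bound(8_000, compute_s=1.0))
```

When the check comes out true for most of your messages, larger work units per message are usually the first thing to try.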
Differences:
They have different memory models. Threads share memory, whereas each MPI node has its own memory, shares nothing with the others, and knows nothing about the rest of the world unless you send something to it.
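The difference in memory models can be seen in a small experiment. Again this is only a sketch using the Python standard library, with a separate process standing in for an MPI node. The names `counter`, `bump`, and `demo` are invented for the example.

```python
import multiprocessing as mp
import threading

counter = {"value": 0}  # module-level state, visible to every thread


def bump():
    counter["value"] += 1


def demo():
    counter["value"] = 0

    # A thread shares memory with its parent, so its update is visible here.
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    after_thread = counter["value"]

    # A separate process works on its own copy of the memory, so its update
    # never reaches the parent unless it is sent back as a message.
    p = mp.Process(target=bump)
    p.start()
    p.join()
    after_process = counter["value"]

    return after_thread, after_process


if __name__ == "__main__":
    print(demo())  # the second number does not change after the process runs
```

To get the process's update back into the parent you would have to send it explicitly, which is exactly what message passing is for.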