Although this post is somewhat old and an answer has already been accepted, I would like to point out that the statement "These functions do not return until the connection is completed" is a little misleading, because blocking communication does not guarantee any handshake between the send and receive operations.
First, you need to know that a send has four communication modes: Standard, Buffered, Synchronous, and Ready, and each of them can be blocking or non-blocking.
Unlike sending, receiving has only one mode and can be blocking or non-blocking.
Before continuing, it is also necessary to be clear about which buffer is the MPI_Send/Recv buffer (the one passed to the call) and which is the system buffer (the local buffer on each process, owned by the MPI library, used to move data around among the ranks of a communication group).
BLOCKING COMMUNICATION: Blocking does not mean that the message has been delivered to the receiver/destination. It just means that the buffer (send or receive) is available for reuse. To make the buffer reusable, it is enough to copy the information to another memory area; that is, the library can copy the buffer data to its own internal memory location, and then, for example, MPI_Send can return.
The MPI standard very clearly separates message buffering from the send and receive operations. A blocking send can complete as soon as the message has been buffered, even if no matching receive has been posted. But in some cases, message buffering can be expensive, and therefore copying directly from the send buffer to the receive buffer can be more efficient. Consequently, the MPI standard provides four different send modes to give the user some freedom in choosing the appropriate send mode for their application. Let's see what happens in each communication mode:
1. Standard mode
In standard mode, it is up to the MPI library whether or not to buffer the outgoing message. If the library decides to buffer the outgoing message, the send can complete even before the matching receive has been posted. If the library decides not to buffer (for performance reasons, or because of unavailability of buffer space), the send will not return until the matching receive has been posted and the data in the send buffer has been moved to the receive buffer.
Thus, MPI_Send in standard mode is non-local in the sense that a send in standard mode can be started whether or not a matching receive has been posted, but its successful completion may depend on the occurrence of a matching receive (because it is implementation-dependent whether the message is buffered or not).
The syntax for the standard send is below:
int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
2. Buffered mode
As in standard mode, a send in buffered mode can be started regardless of whether the matching receive has been posted, and the send may complete before the matching receive is posted. However, the main difference is that if the send starts and no matching receive has been posted, the outgoing message must be buffered. Note that if a matching receive has been posted, the buffered send can happily rendezvous with the process that started the receive, but if there is no receive, the send in buffered mode has to buffer the outgoing message so that the send can complete. Overall, a buffered send is local. Buffer allocation in this case is user-defined, and if there is insufficient buffer space an error occurs.
The syntax for the buffered send is:
int MPI_Bsend(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
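As an illustration, here is a minimal sketch of a buffered-mode send (assuming MPI has been initialized, rank holds the process rank, and the variable names are hypothetical). The user attaches the buffer space with MPI_Buffer_attach before calling MPI_Bsend:

int bufsize = MPI_BSEND_OVERHEAD + (int)sizeof(double);
void *bsend_buf = malloc(bufsize);                 /* user-supplied buffer space */
MPI_Buffer_attach(bsend_buf, bufsize);

double payload = 1.0;                              /* hypothetical data */
MPI_Bsend(&payload, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);  /* completes locally; the message is buffered if no receive is posted */

MPI_Buffer_detach(&bsend_buf, &bufsize);           /* blocks until all buffered messages have been transmitted */
free(bsend_buf);

If the attached buffer is too small for the outstanding messages, MPI_Bsend fails with an error; avoiding that is the user's responsibility.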
3. Synchronous mode
In synchronous mode, the send can be started regardless of whether the matching receive has been posted. However, the send will complete successfully only if the matching receive has been posted and the receiver has started to receive the message sent by the synchronous send. The completion of a synchronous send therefore indicates not only that the send buffer can be reused, but also that the receiving process has started to receive the data. If both the send and the receive are blocking, the communication does not complete at either end until both communicating processes rendezvous at the communication.
The syntax for the synchronous send is:
int MPI_Ssend(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
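Here is a minimal sketch (assuming two ranks and the hypothetical variables data and other) showing why ordering matters with synchronous sends:

if (rank == 0) {
    MPI_Ssend(&data, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);   /* returns only after rank 1 has started the matching receive */
    MPI_Recv(&other, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
} else if (rank == 1) {
    MPI_Recv(&other, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Ssend(&data, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
}

If both ranks called MPI_Ssend first, neither send could complete, because neither matching receive would ever be reached; this is a deadlock that a standard-mode MPI_Send might hide through buffering.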
4. Ready mode
Unlike the previous three modes, a send in ready mode can be started only if the matching receive has already been posted. The completion of the send does not say anything about the matching receive; it only says that the send buffer can be reused. A send using ready mode has the same semantics as standard mode or synchronous mode, just with the additional information about the matching receive. In a correct program with ready-mode communication, the ready sends can be replaced by synchronous or standard sends without affecting the outcome, apart from a difference in performance.
The syntax for the ready send is:
int MPI_Rsend(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
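Here is a minimal sketch of how a ready-mode send could be used safely (hypothetical variable data; it uses a non-blocking receive, discussed later, so that the receive is guaranteed to be posted before the send starts):

MPI_Request req;
if (rank == 1) {
    MPI_Irecv(&data, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);   /* receive is posted first */
}
MPI_Barrier(MPI_COMM_WORLD);          /* guarantees the receive is posted before the send starts */
if (rank == 0) {
    MPI_Rsend(&data, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);         /* legal only because the matching receive already exists */
} else if (rank == 1) {
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}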
Having gone through all four blocking send modes, they may seem fundamentally different, but depending on the implementation the semantics of one mode may be similar to another.
For example, MPI_Send is in general a blocking call, but depending on the implementation, if the message size is not too large, MPI_Send will copy the outgoing message from the send buffer to the system buffer (which is what happens in most modern systems) and return immediately. Let's look at the example below:
//assume there are 4 processes, ranked from 0 to 3
if (rank == 0) {
    tag = 2;
    MPI_Send(&send_buff1, 1, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD);
    MPI_Send(&send_buff2, 1, MPI_DOUBLE, 2, tag, MPI_COMM_WORLD);
    MPI_Recv(&recv_buff1, 1, MPI_FLOAT, 3, 5, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Recv(&recv_buff2, 1, MPI_INT, 1, 10, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
} else if (rank == 1) {
    tag = 10;
    //receive statement missing, nothing is received from rank 0
    MPI_Send(&send_buff3, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
    MPI_Send(&send_buff3, 1, MPI_INT, 3, tag, MPI_COMM_WORLD);
} else if (rank == 2) {
    MPI_Recv(&recv_buff, 1, MPI_DOUBLE, 0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    //do something with the receive buffer
} else { //rank == 3
    MPI_Send(&send_buff, 1, MPI_FLOAT, 0, 5, MPI_COMM_WORLD);
    MPI_Recv(&recv_buff, 1, MPI_INT, 1, 10, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}
Let's see what happens at each rank in the example above.
Rank 0 tries to send to rank 1 and rank 2, and to receive from ranks 1 and 3.
Rank 1 tries to send to rank 0 and rank 3, and does not receive anything from any other rank.
Rank 2 tries to receive from rank 0 and then perform some operation on the data received in recv_buff.
Rank 3 tries to send to rank 0 and to receive from rank 1.
What confuses beginners is that the first send in rank 0 is to rank 1, but rank 1 has not started any receive operation, so the communication should block or stall, and the second send statement in rank 0 should not be executed at all (and this is exactly what the MPI documentation stresses: it is implementation-defined whether or not the outgoing message is buffered). On most modern systems, such small messages (here the size is 1) will easily be buffered, so MPI_Send will return and execute its next MPI_Send statement. Hence, in the above example, even if the receive in rank 1 is never started, the first MPI_Send in rank 0 will return and execute its next statement.
In a hypothetical situation where rank 3 starts execution before rank 0, it will copy the outgoing message in its first send statement from the send buffer to the system buffer (on a modern system ;)) and then start executing its receive statement. As soon as rank 0 finishes its two send statements and begins executing its receive statement, the data buffered in the system buffer by rank 3 is copied into the receive buffer in rank 0.
In the case where a receive operation has been started on a process and the matching send has not yet been posted, the process will block until the receive buffer is filled with the expected data. In this situation, computation or other MPI communication will be blocked/stalled until MPI_Recv has returned.
Having understood the buffering phenomenon, it is worth coming back and thinking more about MPI_Ssend, which has the true semantics of a blocking communication. Even if MPI_Ssend copies the outgoing message from the send buffer to a system buffer (which, again, is implementation-defined), one must note that MPI_Ssend will not return until some acknowledgement (in a low-level format) from the receiving process has been received by the sending process.
Fortunately, MPI decided to keep things easier for users on the receiving side, and there is only one receive in blocking communication: MPI_Recv, and it can be used with any of the four send modes described above. For MPI_Recv, blocking means that the receive returns only after its buffer contains the received data. This implies that the receive can complete only after the matching send has started, but it does not imply whether or not it can complete before the matching send completes.
What happens during such blocking calls is that the computation is stalled until the blocked buffer is freed. This usually leads to a waste of computational resources, since Send/Recv is usually copying data from one memory location to another, while the registers in the CPU remain idle.
NON-BLOCKING COMMUNICATION: For non-blocking communication, the application creates a request for a send and/or receive communication, gets back a handle, and then the call returns. That is all that is needed to guarantee that the operation will be executed; i.e., the MPI library is notified that the operation has to be performed.
On the sender side, this allows overlapping computation with communication.
On the receiver side, this allows overlapping a part of the communication overhead, i.e., copying the message directly into the address space of the receiving side of the application.
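As a minimal sketch of this (hypothetical buffers; assumes MPI is initialized and there are at least two ranks): MPI_Isend / MPI_Irecv return a request handle immediately, useful work can be done while the transfer is in flight, and MPI_Wait blocks only when the buffer is actually needed again.

MPI_Request send_req, recv_req;
if (rank == 0) {
    MPI_Isend(&send_buff, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &send_req);
    /* ... computation that does not touch send_buff ... */
    MPI_Wait(&send_req, MPI_STATUS_IGNORE);   /* send_buff may be reused after this returns */
} else if (rank == 1) {
    MPI_Irecv(&recv_buff, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &recv_req);
    /* ... computation that does not touch recv_buff ... */
    MPI_Wait(&recv_req, MPI_STATUS_IGNORE);   /* recv_buff now holds the received message */
}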