Is MPI_Reduce blocking (or a natural barrier)?

I have the C++ code snippet below, which estimates pi using the classic Monte Carlo technique.

    srand48((unsigned)time(0) + my_rank);
    for(int i = 0 ; i < part_points; i++)
    {
        double x = drand48();
        double y = drand48();
        if( (pow(x,2)+pow(y,2)) < 1){ ++count; }
    }
    MPI_Reduce(&count, &total_hits, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    if(my_rank == root)
    {
        pi = 4*(total_hits/(double)total_points);
        cout << "Calculated pi: " << pi << " in " << end_time-start_time << endl;
    }

I'm wondering whether the MPI_Barrier call is needed. Does MPI_Reduce guarantee that the body of the if statement is not executed before the reduction operation has completed? I hope that is clear. Thanks.
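A side note on the sampling step in the question's snippet: below is a minimal sketch, assuming the hit counter is kept as an integer rather than a double. The names `count_hits` and `pi_from_hits` are hypothetical and not from the original post. With an integer counter, the datatype passed to MPI_Reduce would have to change to match it (for example MPI_LONG for a long), since the snippet's MPI_DOUBLE is only correct if count and total_hits are declared as double.

```cpp
#include <stdlib.h>  // srand48, drand48 (POSIX, as used in the question)

// Hypothetical helper: draws n_points samples in the unit square and counts
// how many fall strictly inside the unit quarter circle.
long count_hits(long n_points, long seed) {
    srand48(seed);
    long hits = 0;
    for (long i = 0; i < n_points; ++i) {
        const double x = drand48();
        const double y = drand48();
        if (x * x + y * y < 1.0) {  // same test as pow(x,2)+pow(y,2) < 1
            ++hits;
        }
    }
    return hits;
}

// Hypothetical helper: the hit ratio approximates pi/4, so scale it by 4.
double pi_from_hits(long hits, long total_points) {
    return 4.0 * static_cast<double>(hits) / static_cast<double>(total_points);
}
```

Each rank would call `count_hits` with its own seed (the question uses time(0) + my_rank) and hand the returned value to the reduction.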

+6
3 answers

Yes, all collective communication calls (Reduce, Scatter, Gather, etc.) are blocking. There is no need for the barrier.

+7

Blocking, yes; a barrier, no. It is very important to call MPI_Barrier() alongside MPI_Reduce() when it is executed in a tight loop. If you do not call MPI_Barrier(), the receive buffers of the reducing process will eventually fill up and the application will abort. While the other participating processes only need to send and continue, the reducing process has to receive and reduce. The code above does not need the barrier if my_rank == root == 0 (which is probably true). In any case, MPI_Reduce() does not perform a barrier or any other form of synchronization. AFAIK even MPI_Allreduce() is not guaranteed to synchronize (at least not by the MPI standard).

+2

Ask yourself whether this barrier is needed. Suppose you are not the root: you call Reduce, which sends off your data. Is there any reason to sit and wait until the root has the result? Answer: no, so you do not need the barrier.

Suppose you are the root. You issue the reduce call. Semantically, you are forced to sit and wait until the result is fully assembled. So why the barrier? Again, no barrier call is needed.

In general, you almost never need barriers, because you rarely care about temporal synchronization. The semantics guarantee that your local state is correct after the reduce call.

0

Source: https://habr.com/ru/post/908466/

