How to track the progress of an OpenMP "for" directive?

I would like to track the progress of a for loop using OpenMP. I know that the reduction clause does not work for this, but I wrote the following:

    #pragma omp for reduction(+:sum)
    for (int i = 0; i < size; i++) {
        // do something that takes about 10 seconds
        sum++;
        #pragma omp critical
        cout << sum << " / " << size << endl;
    }

This prints something like:

    1 / 100
    1 / 100
    2 / 100
    1 / 100
    ...

but I want this:

    1 / 100
    2 / 100
    3 / 100
    ...

Is there a way to get the correct sum value while the reduction is in progress? Or should I use a different method?

3 answers

The reduction clause has a very well-defined meaning, explained in detail in section 2.9.3.6 of the latest OpenMP standard. I doubt that you will be able to use it for the purpose described above.

In any case, it is possible to implement this behavior with minor changes in your source:

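    sum = 0;
    // shared() is only valid on the parallel construct, so the two are combined here
    #pragma omp parallel for shared(sum) schedule(guided)
    for (int i = 0; i < size; i++) {
        // do something that takes about 10 seconds
        #pragma omp critical(PRINT)
        {
            // increment and print inside the same critical section
            sum++;
            cout << sum << " / " << size << endl;
        }
    }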

This way you guarantee that only one thread at a time increments sum and prints it to the screen. Since each iteration takes about 10 seconds, this synchronization should not cause performance issues.


You must use a different method. Reduction creates a thread-private variable (in your case, sum), which is only reduced at the end, when all threads are joined. How the reduction is carried out is highly implementation dependent. It could wait for all threads to complete, it might reduce as threads complete, it could build a reduction tree, etc.

Instead, to track your progress, you could keep a separate shared variable numDone, which each thread increments whenever it finishes an iteration.
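A minimal sketch of that idea, assuming a compiler that supports OpenMP 3.1 or newer (needed for `atomic capture`). The function name `run_with_progress` and the local variable `done` are made up for illustration; only numDone comes from the text above.

```cpp
#include <cstdio>

// Hypothetical helper: runs `size` iterations in parallel and prints progress.
// Returns the final value of the shared counter, which should equal `size`.
int run_with_progress(int size) {
    int numDone = 0;  // shared counter, declared outside the parallel region

    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < size; i++) {
        // ... the long-running work for iteration i would go here ...

        int done;  // per-iteration snapshot of the counter, private to this thread
        #pragma omp atomic capture
        done = ++numDone;  // increment and read back in one atomic step

        // Every value 1..size is printed exactly once, but lines from
        // different threads may still reach the terminal out of order.
        std::printf("%d / %d\n", done, size);
    }
    return numDone;
}
```

Calling `run_with_progress(100)` prints one progress line per iteration. If the lines must appear strictly in increasing order, the increment and the print have to sit inside the same critical section instead.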

EDIT

Wikipedia explains this pretty well:

reduction(operator | intrinsic : list): the variable has a local copy in each thread, but the values of the local copies will be summarized (reduced) into a global shared variable.
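To see why the value printed inside the loop is only a per-thread partial count, it can help to write the reduction out by hand. This is a rough conceptual sketch, not how any particular OpenMP runtime implements the clause; the names `count_with_manual_reduction` and `local_sum` are invented for the example.

```cpp
// Hypothetical helper: counts `size` iterations the way reduction(+:sum)
// conceptually behaves, with the private copies made explicit.
int count_with_manual_reduction(int size) {
    int sum = 0;  // shared result, only touched once per thread at the end

    #pragma omp parallel
    {
        int local_sum = 0;  // per-thread copy, starts at the identity of '+'

        #pragma omp for
        for (int i = 0; i < size; i++) {
            // Only this thread's own iterations are counted here, so printing
            // local_sum at this point would give 1, 1, 2, 1, ... across threads.
            local_sum++;
        }

        // Combine step: each thread folds its partial count into the total.
        #pragma omp atomic
        sum += local_sum;
    }
    return sum;
}
```

The shared total only becomes meaningful after the combine step, which is why it cannot serve as a progress counter during the loop.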


To avoid the need for communication (from updating a shared counter), you can simply print the thread number along with the number of iterations that thread has processed so far, i.e.

    #pragma omp parallel
    {
        int count = 0;
        #pragma omp for schedule(dynamic) // or whatever schedule you want
        for (int i = 0; i < size; ++i) {
            // ...
            printf("@ %d: done %d loops\n", omp_get_thread_num(), ++count); // should not need a critical section
        }
    }

In your specific case, since the work takes about 10 seconds, any communication overhead is not critical, but it may be advisable to use a dynamic schedule, in particular if the amount of work varies between different values of i.


Source: https://habr.com/ru/post/1434343/

