Difference between section and OpenMP task

What is the difference in OpenMP between:

#pragma omp parallel sections
{
    #pragma omp section
    {
        fct1();
    }
    #pragma omp section
    {
        fct2();
    }
}

and:

#pragma omp parallel
{
    #pragma omp single
    {
        #pragma omp task
        fct1();
        #pragma omp task
        fct2();
    }
}

I'm not sure if the second code is correct ...

+44
c parallel-processing openmp
Dec 09
1 answer

The difference between tasks and sections is in the time frame in which the code gets executed. Sections are enclosed within the sections construct and (unless the nowait clause was specified) the threads will not leave it until all sections have been executed:

                 [ sections ]
Thread 0: -------< section 1 >---->*------
Thread 1: -------< section 2      >*------
Thread 2: ------------------------>*------
...                                *
Thread N-1: ---------------------->*------

Here N threads encounter a sections construct with two sections, the second one taking longer than the first. The first two threads each execute one section. The other N-2 threads simply wait at the implicit barrier at the end of the sections construct (shown here as *).
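As an aside, that implicit barrier can be removed with the nowait clause on the sections construct. A minimal sketch, with trivial fct1()/fct2() stubs added purely for illustration:

#include <stdio.h>
#include <omp.h>

void fct1(void) { printf("fct1 on thread %d\n", omp_get_thread_num()); }
void fct2(void) { printf("fct2 on thread %d\n", omp_get_thread_num()); }

int main(void)
{
    #pragma omp parallel
    {
        #pragma omp sections nowait   // no implicit barrier at the end
        {
            #pragma omp section
            fct1();
            #pragma omp section
            fct2();
        }
        // threads that were not given a section arrive here without waiting
        printf("thread %d is past the sections construct\n",
               omp_get_thread_num());
    }
    return 0;
}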

Tasks are queued and executed whenever possible at so-called task scheduling points. Under some conditions the runtime is allowed to move a task between threads, even in the middle of its lifetime. Such tasks are called untied: an untied task might start executing in one thread and then, at some scheduling point, be migrated by the runtime to another thread.
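For illustration only, this is how an untied task is created; long_work() here is a made-up stand-in for a lengthy computation:

#include <stdio.h>
#include <omp.h>

// made-up stand-in for a lengthy computation
void long_work(void) { printf("running on thread %d\n", omp_get_thread_num()); }

int main(void)
{
    #pragma omp parallel
    #pragma omp single
    {
        // an untied task may be resumed by a different thread than the one
        // that started it, once a task scheduling point is reached
        #pragma omp task untied
        long_work();
    }
    return 0;
}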

Still, tasks and sections are similar in many respects. For example, the following two code fragments achieve essentially the same result:

// sections
...
#pragma omp sections
{
    #pragma omp section
    foo();
    #pragma omp section
    bar();
}
...

// tasks
...
#pragma omp single nowait
{
    #pragma omp task
    foo();
    #pragma omp task
    bar();
}
#pragma omp taskwait
...

taskwait works very much like barrier but for tasks: it ensures that the current flow of execution pauses until all queued tasks have been executed. It is a scheduling point, i.e. it allows threads to process tasks. The single construct is needed so that the tasks are created by only one thread. Without the single construct, each task would be created num_threads times, which is probably not what you want. The nowait clause on the single construct tells the other threads not to wait until the single construct has finished (i.e. it removes the implicit barrier at the end of the single construct). So they hit the taskwait immediately and start processing tasks.
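To make the effect of single concrete, here is a small sketch (foo()/bar() are illustrative stubs and num_threads(4) is an arbitrary choice): in the first parallel region every thread creates both tasks, so each task runs four times; in the second one each task is created exactly once.

#include <stdio.h>
#include <omp.h>

void foo(void) { printf("foo on thread %d\n", omp_get_thread_num()); }
void bar(void) { printf("bar on thread %d\n", omp_get_thread_num()); }

int main(void)
{
    // without single: every thread executes the task-creating code,
    // so foo() and bar() are each enqueued num_threads times
    #pragma omp parallel num_threads(4)
    {
        #pragma omp task
        foo();
        #pragma omp task
        bar();
    }

    printf("---\n");

    // with single nowait + taskwait: the tasks are created once and the
    // other threads are free to start executing them right away
    #pragma omp parallel num_threads(4)
    {
        #pragma omp single nowait
        {
            #pragma omp task
            foo();
            #pragma omp task
            bar();
        }
        #pragma omp taskwait
    }
    return 0;
}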

taskwait is an explicit scheduling point, shown here for clarity. There are also implicit scheduling points, most notably inside barrier synchronizations, no matter whether explicit or implicit. Therefore the above code could also be written as:

// tasks
...
#pragma omp single
{
    #pragma omp task
    foo();
    #pragma omp task
    bar();
}
...

Here is one possible scenario of what might happen if there are three threads:

             +--+-->[ task queue ]--+
             |                      |
             |                      |
             |                +-----------+
             |                |           |
Thread 0: --< single >-|      v           v       |-----
Thread 1: -------->|        < foo() >             |-----
Thread 2: -------->|                    < bar() > |-----

The part shown as | ... | is the action of the scheduling point (be it the taskwait directive or the implicit barrier). Basically, threads 1 and 2 suspend whatever they are doing at that point and start processing tasks from the queue. Once all tasks have been processed, the threads resume their normal execution flow. Note that threads 1 and 2 may reach the scheduling point before thread 0 has exited the single construct, so the left |s do not have to be aligned (this is reflected in the diagram above).

It may also happen that thread 1 finishes processing the foo() task and requests another one before the other threads are able to request tasks. In that case both foo() and bar() might get executed by the same thread:

             +--+-->[ task queue ]--+
             |                      |
             |                      |
             |                +------------+
             |                |            |
Thread 0: --< single >-|      v            v        |---
Thread 1: --------->|      < foo() >< bar() >       |---
Thread 2: --------------------->|                   |---

It is also possible for the thread that executed the single construct to pick up the second task itself if thread 2 arrives too late:

             +--+-->[ task queue ]--+
             |                      |
             |                      |
             |                +------------+
             |                |            |
Thread 0: --< single >-|      v     < bar() >       |---
Thread 1: --------->|      < foo() >                |---
Thread 2: ----------------->|                       |---

In some cases the OpenMP compiler or runtime might even bypass the task queue completely and simply execute the tasks in sequence:

Thread 0: --< single: foo(); bar() >*---
Thread 1: ------------------------->*---
Thread 2: ------------------------->*---

If no task scheduling points are present in the region's code, the OpenMP runtime is free to start the tasks whenever it sees fit. For example, it is possible that all tasks are deferred until the barrier at the end of the parallel region is reached.
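One way to observe all of this is to print which thread ends up running each task. A small sketch along those lines (the output varies from run to run and between OpenMP runtimes, matching the scenarios above):

#include <stdio.h>
#include <omp.h>

void foo(void) { printf("foo ran on thread %d\n", omp_get_thread_num()); }
void bar(void) { printf("bar ran on thread %d\n", omp_get_thread_num()); }

int main(void)
{
    #pragma omp parallel num_threads(3)
    {
        #pragma omp single
        {
            printf("single executed by thread %d\n", omp_get_thread_num());
            #pragma omp task
            foo();
            #pragma omp task
            bar();
        }
        // the implicit barrier at the end of single (and of the parallel
        // region) is a task scheduling point where the tasks may run
    }
    return 0;
}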

+99
Dec 09 '12 at 16:08


