Splitting OpenMP threads on an unbalanced tree

I am trying to make tree-like operations, for example summing the numbers in all leaves of a tree, run in parallel using OpenMP. The problem I am facing is that the tree I'm working on is unbalanced (the number of children varies from node to node, and so does the size of the branches).

I currently have recursive functions working on these trees. I am trying to achieve this:

1) Split the threads at the first opportunity, say at a node that has 2 children

2) Continue splitting in both resulting threads for at least 2-3 levels so that all threads have work.

It will look like this:

if (node->depth <= 3) {
    #pragma omp parallel
    {
        /* share the children among the team; dynamic scheduling
           helps because the subtrees differ in size */
        #pragma omp for schedule(dynamic)
        for (int i = 0; i < node->children_no; i++) {
            int local_sum;

            local_sum = sum_numbers(node->children[i]);
            #pragma omp critical
            {
                global_sum += local_sum;
            }
        }
    }
} else {
    /* run the for loop without a parallel region */
}

The problem is that when I enable nested parallelism, OpenMP seems to create a lot of threads in new teams. I would like to achieve this:

1) , , , MAX_THREADS

2) for , , , , .

, , , , , , , .

, . , , , , ?
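If the aim is to keep nested regions from multiplying threads, one option is to cap the nesting depth with `omp_set_max_active_levels` and give each team an explicit `num_threads`. The sketch below is only an illustration under that assumption: `count_inner_threads` and its parameters are hypothetical names, and the `_OPENMP` guards are there so it also builds without OpenMP. Regions nested deeper than the allowed level run with a team of one thread, so the count can never exceed `threads_per_team` raised to the number of active levels. The runtime may still hand out fewer threads. Older runtimes may also need nesting switched on (for example `OMP_NESTED=true`), and `OMP_THREAD_LIMIT` can cap the total from the environment.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: counts how many threads reach the innermost of
   three nested parallel regions when at most levels_allowed of them
   are allowed to be active. */
int count_inner_threads(int levels_allowed, int threads_per_team)
{
    int count = 0;

#ifdef _OPENMP
    /* Called from the sequential part; regions nested deeper than this
       limit are executed by a team of one thread. */
    omp_set_max_active_levels(levels_allowed);
#endif

    #pragma omp parallel num_threads(threads_per_team)
    {
        #pragma omp parallel num_threads(threads_per_team)
        {
            #pragma omp parallel num_threads(threads_per_team)
            {
                /* every thread of every innermost team counts itself */
                #pragma omp atomic
                count++;
            }
        }
    }
    return count;
}
```

Calling `count_inner_threads(2, 2)` can report at most 4 threads, since the third level is inactive.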


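Following High Performance Mark's suggestion, the version below uses OpenMP tasks. To see how the OpenMP parallelism actually behaves, a performance analysis tool can help (for example Vampir, Paraver / HPCToolkit).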

if (node->depth <= 3) {
    #pragma omp parallel shared(global_sum)
    {
        /* one thread creates the tasks; the whole team executes them */
        #pragma omp single
        for (int i = 0; i < node->children_no; i++) {
            #pragma omp task
            {
                int local_sum = sum_numbers(node->children[i]);

                #pragma omp critical
                global_sum += local_sum;
            }
        }
    }
} else {
    /* run the for loop without a parallel region */
}
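For completeness, here is a self-contained sketch of the task approach applied to the leaf sum. It is not the code from the question: the `node_t` layout, `make_leaf`, `make_node`, `sum_leaves`, `parallel_sum` and the `TASK_DEPTH` cutoff are assumptions made for illustration. The parallel region is opened once at the root and the recursion only spawns tasks, so nested parallelism is not needed and the team size stays fixed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; only children_no and children mirror the
   names used in the snippets above. */
typedef struct node {
    int value;                 /* meaningful only for leaves */
    int children_no;
    struct node **children;
} node_t;

#define TASK_DEPTH 3  /* assumed cutoff: deeper levels do not defer tasks */

static node_t *make_leaf(int value)
{
    node_t *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->value = value;
    return n;
}

static node_t *make_node(int children_no, node_t **children)
{
    node_t *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->children_no = children_no;
    n->children = children;    /* borrowed array, must outlive the node */
    return n;
}

/* Sum of the leaf values below n, one task per child. */
static long sum_leaves(const node_t *n, int depth)
{
    if (n->children_no == 0)
        return n->value;

    long total = 0;
    for (int i = 0; i < n->children_no; i++) {
        /* When the if() condition is false the task is undeferred: the
           encountering thread runs it immediately, which limits task
           overhead deep in the tree. */
        #pragma omp task shared(total) firstprivate(i) if(depth < TASK_DEPTH)
        {
            long part = sum_leaves(n->children[i], depth + 1);
            #pragma omp atomic
            total += part;
        }
    }
    /* total is complete only once all child tasks have finished */
    #pragma omp taskwait
    return total;
}

long parallel_sum(const node_t *root)
{
    long result = 0;
    /* one team for the whole traversal; a single thread starts the
       recursion and the others pick up the generated tasks */
    #pragma omp parallel
    #pragma omp single
    result = sum_leaves(root, 0);
    return result;
}
```

Because idle threads take whatever task is available, a large branch does not leave the rest of the team waiting the way a static split would. A good value for `TASK_DEPTH` depends on the tree and should be measured.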

Source: https://habr.com/ru/post/1621666/

