Which loop should be parallelized: the outer or the inner one?

I am writing an image processing filter and I want to speed up the calculations using OpenMP. The structure of my code, in pseudo-code, is as follows:

for (every pixel in the image) {
    // do some work here
    for (every combination of parameters) {
        // do more work here and apply the filter
    }
}

The code filters each pixel using different parameters and selects the optimal ones.

My question is which is faster: parallelizing the outer loop over pixels across processors, or processing the pixels sequentially and parallelizing the inner loop over parameters?

I think the question may be more general: which is faster, giving each thread a large amount of work, or creating many threads that each perform only a few operations?

For now I am not concerned with implementation details; I think I can handle them based on my previous experience with OpenMP. Thanks!
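To make the two options concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: `filter_cost`, `NUM_PARAMS`, `best_params_outer` and `best_params_inner` are hypothetical names that do not come from the real filter, and the cost function is a stand-in for the actual filtering work. The first function enters a single parallel region and hands each thread many pixels; the second enters a parallel region once per pixel and has to synchronize on the shared minimum. Which one is faster on a given machine should be measured rather than assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of parameter combinations tried per pixel. */
#define NUM_PARAMS 16

/* Hypothetical stand-in for the real filter: returns a cost for
 * filtering one pixel with parameter set p (lower is better). */
static int filter_cost(unsigned char pixel, int p)
{
    int d = (int)pixel - p * 16;
    return d < 0 ? -d : d;
}

/* Option A: parallelize the outer loop over pixels.
 * One parallel region; every thread processes many pixels and
 * runs the whole parameter search for each of them on its own. */
void best_params_outer(const unsigned char *img, int n_pixels, int *best)
{
    assert(img != NULL && best != NULL && n_pixels >= 0);

    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_pixels; ++i) {
        int best_p = 0;
        int best_cost = filter_cost(img[i], 0);
        for (int p = 1; p < NUM_PARAMS; ++p) {
            int c = filter_cost(img[i], p);
            if (c < best_cost) {
                best_cost = c;
                best_p = p;
            }
        }
        best[i] = best_p;
    }
}

/* Option B: walk the pixels sequentially and parallelize the
 * inner loop over parameters. A parallel region is entered once
 * per pixel, and the shared minimum needs a critical section. */
void best_params_inner(const unsigned char *img, int n_pixels, int *best)
{
    assert(img != NULL && best != NULL && n_pixels >= 0);

    for (int i = 0; i < n_pixels; ++i) {
        int best_p = 0;
        int best_cost = filter_cost(img[i], 0);

        #pragma omp parallel for
        for (int p = 1; p < NUM_PARAMS; ++p) {
            int c = filter_cost(img[i], p);
            #pragma omp critical
            {
                /* Ties go to the smaller p so that the result does not
                 * depend on the order in which threads arrive here. */
                if (c < best_cost || (c == best_cost && p < best_p)) {
                    best_cost = c;
                    best_p = p;
                }
            }
        }
        best[i] = best_p;
    }
}
```

Both functions fill `best` with the same values, so they can be swapped and timed against each other on a real image. With GCC or Clang the pragmas take effect when compiling with `-fopenmp`; without that flag they are ignored and the code runs serially.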


- . ( ) ​​. parallelism, , . , .

+4

. , , "" .

, , , . , , , . .

+4

,

, .

: /, , .

+3

Source: https://habr.com/ru/post/1527523/

