I am writing an image processing filter and want to speed up the calculations using OpenMP. The structure of my code, in pseudo-code, is as follows:
for (every pixel in the image) {
    // do some stuff here
    for (any combination of parameters) {
        // do other stuff here and filter
    }
}
The code filters each pixel using different parameters and selects the optimal ones.
My question is which is faster: parallelizing the outer loop, so the pixels are divided among the processors, or going through the pixels sequentially and parallelizing the inner loop over the parameter combinations?
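To make the two options above concrete, here is a minimal sketch in C of what I mean. The names filter_cost, best_params_outer and best_params_inner are placeholders I made up for illustration, the cost function is a toy stand-in for the real filter, and I assume there is at least one parameter combination. It is not my actual code.

```c
/* Hypothetical stand-in for "filter this pixel with this parameter set".
   Returns a cost; lower is better. The real filter would go here. */
static double filter_cost(double pixel, int param)
{
    double d = pixel - (double)param;
    return d * d;
}

/* Option A: parallelize the outer loop. Each thread processes whole
   pixels and runs the full parameter search for each of them. */
void best_params_outer(const double *img, int *best, int n_pixels, int n_params)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_pixels; ++i) {
        int best_p = 0;
        double best_c = filter_cost(img[i], 0);
        for (int p = 1; p < n_params; ++p) {
            double c = filter_cost(img[i], p);
            if (c < best_c) { best_c = c; best_p = p; }
        }
        best[i] = best_p;
    }
}

/* Option B: visit pixels sequentially and parallelize the parameter
   search. A parallel region is entered once per pixel, and the
   per-thread minima have to be merged at the end. */
void best_params_inner(const double *img, int *best, int n_pixels, int n_params)
{
    for (int i = 0; i < n_pixels; ++i) {
        int best_p = 0;
        double best_c = filter_cost(img[i], 0);
        #pragma omp parallel
        {
            /* Thread-local running minimum. */
            int local_p = 0;
            double local_c = filter_cost(img[i], 0);
            #pragma omp for nowait
            for (int p = 1; p < n_params; ++p) {
                double c = filter_cost(img[i], p);
                if (c < local_c || (c == local_c && p < local_p)) {
                    local_c = c;
                    local_p = p;
                }
            }
            /* Merge into the shared result; ties go to the lowest index
               so the answer matches Option A. */
            #pragma omp critical
            {
                if (local_c < best_c || (local_c == best_c && local_p < best_p)) {
                    best_c = local_c;
                    best_p = local_p;
                }
            }
        }
        best[i] = best_p;
    }
}
```

Both functions should produce the same result. When compiled without OpenMP support (for example, without -fopenmp in GCC) the pragmas are ignored and the code runs serially.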
I think the question may be more general: which is faster, giving each thread a large amount of work, or creating many threads that each perform only a few operations?
For now I am not concerned with implementation details; I think I can handle those based on my previous experience with OpenMP. Thanks!