While parallelizing two nested for loops, I came across behavior that I cannot explain. I tried three different OpenMP parallelization variants on an i7 860 and a Xeon E5540, and I expected the code to behave more or less the same on both platforms, meaning that one platform should be faster in all three cases. But this is not so:
- For case 1, the Xeon is faster by ~10%,
- for case 2, the i7 is 2 times faster, and
- for case 3, the Xeon is again faster by a factor of 1.5.
Do you have any idea what might cause this?
Please let me know if you need more information or clarification!
To clarify, my question is more general: if I run the same code on the i7 and on the Xeon system, shouldn't OpenMP produce comparable (proportional) results across the two?
Pseudo code:

for 1:4
    for 1:1000
        vector_multiplication
    end
end
Cases:
case 1: no pragma omp, no parallelization
case 2: pragma omp for on the first (outer) loop
case 3: pragma omp for on the second (inner) loop
Results
Here are the actual numbers from the `time` command:
case 1
Time Xeon i7
real 11m14.120s 12m53.679s
user 11m14.030s 12m46.220s
sys 0m0.080s 0m0.176s
case 2
Time Xeon i7
real 8m57.144s 4m37.859s
user 71m10.530s 29m07.797s
sys 0m0.300s 0m00.128s
case 3
Time Xeon i7
real 2m00.234s 3m35.866s
user 11m52.870s 22m10.799s
sys 0m00.170s 0m00.136s
[Update]
Thanks for all the tips. I am still investigating the cause.