Your program may run too quickly for omp_get_wtime to measure reliably. If you only want to measure the time and do not care about the final contents of mZ, you can repeat the test several times and divide the accumulated time by the number of repetitions:
    #define REPS 1024
    ...
    ...
    double acumtime = 0.0;
    for (rep = 0; rep < REPS; rep++)
    {
        double start = omp_get_wtime();
        /* k must be private as well; if it is declared outside the region it is shared by default */
        #pragma omp parallel for schedule(dynamic,3) private(i, k) num_threads(nthreads)
        for (i = 0; i < SIZE; i++)
        {
            for (k = 0; k < SIZE; k++)
            {
                mZ[i][k] = mX[i][k] + mY[i][k];
                printf("Thread no %d \t [%d] [%d] result: %d\n", omp_get_thread_num(), i, k, mZ[i][k]);
            }
        }
        /* accumulate the wall-clock time of this repetition */
        acumtime += omp_get_wtime() - start;
    }
    printf("Elapsed time is: %f\n", acumtime / REPS);
You may also want to remove the printf calls inside the parallel region, since output from inside the loop can be a serious cause of slowdown and would end up dominating the measured time.
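As a rough illustration, here is a minimal sketch of the same measurement with the printf calls taken out of the timed loop. The value of SIZE, the helper names fill_inputs and average_add_time, and the clock()-based fallback used when the code is built without OpenMP are my assumptions, not part of your program:

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Assumed fallback so the sketch still builds without -fopenmp.
   clock() reports CPU time, which is close to wall time only for serial runs. */
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

#define SIZE 256 /* hypothetical matrix dimension */

int mX[SIZE][SIZE], mY[SIZE][SIZE], mZ[SIZE][SIZE];

/* Fill the inputs with simple known values so the result can be checked. */
void fill_inputs(void)
{
    for (int i = 0; i < SIZE; i++)
        for (int k = 0; k < SIZE; k++) {
            mX[i][k] = i + k;
            mY[i][k] = i - k;
        }
}

/* Run the addition `reps` times and return the mean time per run in seconds. */
double average_add_time(int reps, int nthreads)
{
    assert(reps > 0);
    (void)nthreads; /* unused when built without OpenMP */

    double acumtime = 0.0;
    for (int rep = 0; rep < reps; rep++) {
        double start = omp_get_wtime();
        /* i and k are declared inside the loops, so each thread gets its own copy */
        #pragma omp parallel for schedule(dynamic, 3) num_threads(nthreads)
        for (int i = 0; i < SIZE; i++)
            for (int k = 0; k < SIZE; k++)
                mZ[i][k] = mX[i][k] + mY[i][k];
        acumtime += omp_get_wtime() - start;
    }
    return acumtime / reps;
}
```

From main you would call fill_inputs() once and then print the value returned by something like average_add_time(1024, nthreads). Declaring the loop counters inside the for statements avoids the need for a private clause altogether.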