If “OpenMP parallel spinning,” which I would call “parallel overhead,” is dominating your loop, it means you probably don't have enough work to parallelize. Parallelization gives a speedup only when the problem size is sufficient. You have already shown an extreme example: there is no work at all in the parallel loop. In that case, you will observe wildly varying times due to the parallel overhead.
The parallel overhead of OpenMP's `omp parallel for` includes several factors:
- First, `omp parallel for` is the sum of `omp parallel` and `omp for`.
- The overhead of spawning or waking up threads (though most OpenMP implementations keep a thread pool rather than creating and destroying threads on every `omp parallel`).
- For `omp for`: (a) the overhead of distributing the work among the threads, and (b) the scheduling overhead (especially if dynamic scheduling is used).
- The overhead of the implicit barrier at the end of the worksharing loop, unless `nowait` is specified (note that `nowait` applies to `omp for`; the barrier at the end of the `omp parallel` region itself cannot be removed).
FYI, the following code would be more effective for measuring the OpenMP parallel overhead:
```cpp
double measureOverhead(int tripCount) {
    static const size_t TIMES = 10000;
    int sum = 0;

    // Serial baseline
    clock_t startTime = clock();
    for (size_t k = 0; k < TIMES; ++k) {
        for (int i = 0; i < tripCount; ++i) {
            sum += i;
        }
    }
    clock_t elapsedTime = clock() - startTime;

    // The same loop wrapped in the OpenMP construct
    clock_t startTime2 = clock();
    for (size_t k = 0; k < TIMES; ++k) {
        #pragma omp parallel for private(sum)
        for (int i = 0; i < tripCount; ++i) {
            sum += i;
        }
    }
    clock_t elapsedTime2 = clock() - startTime2;

    // Average extra time per construct invocation, in clock ticks
    double parallelOverhead = double(elapsedTime2 - elapsedTime) / TIMES;
    return parallelOverhead;
}
```
Try to run such a small benchmark several times and take the average. Also, put at least a minimal workload in the loops, or the compiler may optimize them away entirely. In the above code, `parallelOverhead` approximates the overhead of the OpenMP `omp parallel for` construct.