You need to use a different method. The reduction creates a thread-private copy of the variable (in your case, sum), and those copies are only reduced into the shared variable at the end, when the threads are joined. How the reduction is carried out is highly implementation dependent: it may wait until all threads are done, it may reduce as threads finish, it may build a reduction tree, etc.
Instead, to track your progress, you could have another shared variable, numDone, which each thread increments atomically when it finishes a unit of work.
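Something along these lines might work. This is only a sketch: the function name `sum_with_progress`, the `data` array, and the report interval of 250 are made up for illustration and are not from your code. It needs OpenMP 3.1 or later for `atomic capture`.

```c
#include <stdio.h>

/* Hypothetical helper: sums data[0..n-1] with a reduction and tracks
   progress in a separate shared counter instead of reading sum. */
long sum_with_progress(const int *data, int n, int *num_done_out)
{
    long sum = 0;
    int num_done = 0;   /* shared progress counter (assumed name) */

    #pragma omp parallel for reduction(+:sum) shared(num_done)
    for (int i = 0; i < n; i++) {
        sum += data[i];   /* updates this thread's private copy of sum */

        int done;
        /* atomically increment the shared counter and read its new value */
        #pragma omp atomic capture
        done = ++num_done;

        /* report every 250 finished iterations (arbitrary interval) */
        if (done % 250 == 0) {
            printf("progress: %d/%d\n", done, n);
        }
    }

    *num_done_out = num_done;
    return sum;
}
```

The counter is deliberately not part of the reduction, so its value is meaningful while the loop is still running. The atomic update does add some contention; if that matters, each thread could count locally and only update numDone every so often.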
EDIT
Wikipedia explains this pretty well:
reduction(operator | intrinsic : list): the variable has a local copy in each thread, but the values of the local copies will be summarized (reduced) into a global shared variable.
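To make that concrete, here is roughly what the clause does for you, written out by hand. This is a conceptual sketch, not what any particular compiler generates; as said above, a real runtime may combine the copies differently (for example with a tree). The name `manual_sum` is hypothetical.

```c
/* Hypothetical helper: hand-written equivalent of reduction(+:sum). */
long manual_sum(const int *data, int n)
{
    long sum = 0;   /* the global shared variable */

    #pragma omp parallel
    {
        long local = 0;   /* this thread's local copy */

        /* iterations are divided among the threads */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            local += data[i];
        }

        /* combine this thread's copy into the shared variable */
        #pragma omp atomic
        sum += local;
    }

    return sum;
}
```

Until a thread reaches the final atomic update, its contribution exists only in `local`, which is why reading sum in the middle of the loop tells you nothing reliable about progress.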