If you are on x86, see FreeMemory's answer to this question for RDTSC. I tested it and it seems to work fine on my system (a Mac), but see my answer to this question as well. Also see the criticism of RDTSC here.
Usually you should not need to go down to that low a level of detail. Everything else the computer has to do also uses clock cycles, so cycle counts will vary with system load. I find omp_get_wtime() sufficient, although I have to put my code in an extra loop so that the timed region runs for about a second; that is what gives me consistent results from one launch to the next.
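A minimal sketch of that extra-loop approach, not a definitive benchmark harness. The names wall_seconds, work and time_per_call are made up for illustration, and the clock_gettime fallback for builds without OpenMP is my own assumption rather than something the approach requires:

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime in the non-OpenMP fallback */
#include <assert.h>
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>
/* Wall-clock time in seconds from OpenMP (build with e.g. -fopenmp). */
static double wall_seconds(void) { return omp_get_wtime(); }
#else
#include <time.h>
/* Assumed fallback so the sketch still builds without OpenMP. */
static double wall_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
#endif

/* Volatile sink so the compiler cannot optimize the work away. */
static volatile double sink;

/* Hypothetical stand-in for the code you actually want to time. */
static void work(void) {
    double s = 0.0;
    for (int i = 1; i <= 1000; ++i) s += 1.0 / i;
    sink = s;
}

/* Calls fn repeatedly until at least min_seconds of wall time have passed,
 * then returns the average seconds per call. The batch size doubles each
 * round (up to a cap) so the timer itself is read only occasionally. */
static double time_per_call(void (*fn)(void), double min_seconds, long *calls_out) {
    assert(fn != NULL && min_seconds > 0.0);
    long calls = 0;
    long batch = 1;
    double elapsed = 0.0;
    double start = wall_seconds();
    do {
        for (long i = 0; i < batch; ++i) fn();
        calls += batch;
        elapsed = wall_seconds() - start;
        if (batch < (1L << 20)) batch *= 2;
    } while (elapsed < min_seconds);
    if (calls_out != NULL) *calls_out = calls;
    return elapsed / (double)calls;
}
```

From main you would call something like time_per_call(work, 1.0, &calls) and print the result. The last batch can overshoot the one-second target, which is fine because the average is taken over the time actually measured.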