I am looking for ways to run micro-benchmarks on multi-core processors.
Context:
Around the same time that desktop processors introduced out-of-order execution, which had a big impact on performance and made it harder to predict, they also, perhaps not by accident, introduced special instructions for obtaining very precise timings. Examples of these instructions are rdtsc on x86 and mftb on PowerPC. These instructions gave timings more precise than a system call could ever allow, and let programmers micro-benchmark their hearts out, for better or for worse.
On an even more modern processor with several cores, some of which sleep some of the time, the counters are not synchronized between cores. We are told that rdtsc is no longer safe to use for benchmarking, but I must have been napping when the alternative solutions were explained to us.
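To make the kind of measurement concrete, here is a minimal sketch in C of the sort of timing loop I have in mind. It assumes GCC or Clang on x86, where __rdtsc() comes from <x86intrin.h>; on other targets it falls back to clock_gettime(CLOCK_MONOTONIC) so that it still compiles. The names read_ticks, work, and min_ticks are made up for illustration, and the sketch does nothing about the cross-core synchronization problem described above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>          /* GCC/Clang: declares __rdtsc() */
#define HAVE_TSC 1
#else
#define HAVE_TSC 0
#endif

/* Read a raw tick count: the TSC on x86, otherwise nanoseconds from
   CLOCK_MONOTONIC (an assumed fallback, not part of the question). */
static uint64_t read_ticks(void)
{
#if HAVE_TSC
    return __rdtsc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical stand-in for the code under test: sums 0..n-1.
   The volatile accumulator keeps the loop from being optimized away. */
static uint64_t work(uint64_t n)
{
    volatile uint64_t acc = 0;
    for (uint64_t i = 0; i < n; i++)
        acc += i;
    return acc;
}

/* Time work(n) several times and keep the smallest tick difference,
   which is the run least disturbed by interrupts and other noise. */
static uint64_t min_ticks(uint64_t n, int runs)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < runs; r++) {
        uint64_t t0 = read_ticks();
        (void)work(n);
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

Calling min_ticks(100000, 5) returns the smallest tick count seen over five runs. If the thread migrates to another core between the two reads, the difference can be meaningless, which is exactly the problem I am asking about.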
Question:
Some systems may save and restore the performance counter and provide an API call to read the correct total. If you know what this call is for any operating system, please let us know in an answer.
Some systems may allow you to turn off cores, leaving only one running. I know that Mac OS X Leopard does when the right preference pane is installed from the Developer Tools. Do you think this makes rdtsc safe to use again?
More context:
Please assume I know what I am doing when trying to make a micro-benchmark. If you think that an optimization whose gains cannot be measured by timing the entire application is not worth making, I agree with you, but
I cannot time the whole application until the alternative data structure is finished, which will take a long time. In fact, if the micro-benchmark were not promising, I could decide to abandon the implementation now;
I need figures to include in a publication whose deadline I do not control.