I was doing some timing and efficiency tests and came across some unexpected behavior: my program actually runs faster when I start other background processes that peg all the CPU cores at 100%. Here is a simplified sample program:
#define _XOPEN_SOURCE 600
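Below is a rough sketch of the kind of CPU-bound, clock()-timed test involved; the workload, array sizes, and iteration count are placeholders rather than the exact program (the _XOPEN_SOURCE 600 define is there because posix_memalign() needs it):

#define _XOPEN_SOURCE 600
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Placeholder workload: a simple CPU-bound float loop. The real test
   program differs; this only shows the shape of a benchmark whose CPU
   time is reported via clock(). */
static void work(const float *u, const float *v, float *y, int n)
{
    int i;
    for (i = 0; i < n; i++)
        y[i] = u[i] * 1.000001f + v[i];
}

int main(void)
{
    const int n = 16384, its = 100000;   /* placeholder sizes */
    float *u, *v, *y;
    int i;
    clock_t start, stop;

    /* posix_memalign() is the reason for the _XOPEN_SOURCE 600 define */
    posix_memalign((void **)&u, 16, n * sizeof(float));
    posix_memalign((void **)&v, 16, n * sizeof(float));
    posix_memalign((void **)&y, 16, n * sizeof(float));

    for (i = 0; i < n; i++) { u[i] = 1.0f; v[i] = 2.0f; }

    start = clock();
    for (i = 0; i < its; i++)
        work(u, v, y, n);
    stop = clock();

    printf("Done, cpu time: %f\n", (double)(stop - start) / CLOCKS_PER_SEC);
    return 0;
}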
I am running this on a rather old Pentium 4 @ 2.8GHz with Hyper-Threading enabled, which shows up as two processors in /proc/cpuinfo.
Output with the system otherwise idle:
$ ./test
Done, cpu time: 11.450000
And now with all the cores loaded:
$ md5sum /dev/zero& ./test; killall md5sum
Done, cpu time: 8.930000
This result is consistent. My guess is that I am somehow improving cache efficiency by reducing how often the program migrates to the other processor, but that is just a shot in the dark. Can anyone confirm or deny this?
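One way to test that hypothesis would be to pin the process to a single logical CPU and see whether the speedup shows up without any background load. A rough sketch using sched_setaffinity() follows; the pin_to_cpu0() helper is just illustrative, not part of the test program:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Pin the calling process to logical CPU 0 before running the timed loop,
   to rule out migration between the two Hyper-Threading siblings. */
static int pin_to_cpu0(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}

int main(void)
{
    if (pin_to_cpu0() == 0)
        printf("pinned to CPU 0\n");
    /* ... run the actual benchmark loop here ... */
    return 0;
}

The same effect should be achievable from the shell with taskset -c 0 ./test.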
Secondary question: I was also surprised to find that the reported CPU time can vary so much from run to run. The timing method used above is taken straight from the GNU C manual, and I thought that using clock() would protect me from timing fluctuations caused by other processes using the CPU. Obviously, based on the results above, that is not the case. So my secondary question is: is the clock() method the proper way to measure performance?
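For reference, the clock() idiom in question measures CPU time consumed by the process, not wall-clock time. A rough, self-contained comparison against clock_gettime() with CLOCK_PROCESS_CPUTIME_ID, which reports the same per-process CPU time at nanosecond resolution (and may need -lrt when linking on older glibc), might look like this; busy_work() is just a stand-in workload:

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Stand-in workload; the real timed loop would go here. */
static volatile double sink;
static void busy_work(void)
{
    long i;
    for (i = 0; i < 100000000L; i++)
        sink += i * 0.5;
}

int main(void)
{
    clock_t c0, c1;
    struct timespec t0, t1;

    /* clock(): CPU time used by this process, the GNU C manual idiom. */
    c0 = clock();
    busy_work();
    c1 = clock();
    printf("clock():         %f s\n", (double)(c1 - c0) / CLOCKS_PER_SEC);

    /* clock_gettime(CLOCK_PROCESS_CPUTIME_ID): the same per-process CPU
       time, but with nanosecond resolution. */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);
    busy_work();
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);
    printf("clock_gettime(): %f s\n",
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    return 0;
}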
Update: I have looked into the suggestions in the comments about CPU frequency scaling, and I don't think that is what is going on here. I tried monitoring the CPU speed in real time via watch grep \"cpu MHz\" /proc/cpuinfo (as suggested here), and I do not see the frequency change while the program is running. I should also have mentioned that I am running a fairly old kernel: 2.6.25.
Update 2: I have been using the script below to play with the number of md5sum processes running. Even when I run more processes than there are logical processors, the test is still faster than when it runs alone.
Update 3: Disabling Hyper-Threading in the BIOS makes this strange behavior go away, and every run takes about 11 seconds of CPU time. Hyper-Threading seems to be the key.
Update 4: I just ran this on a dual-core Intel Xeon @ 2.5GHz and did not see any of this strange behavior. This "problem" may be fairly specific to my particular hardware setup.
#!/bin/bash
declare -i num=$1
for (( num; num; num-- )); do
    md5sum /dev/zero &
done
time ./test
killall md5sum
$ ./run_test.sh 5
Done, cpu time: 9.070000

real    0m27.738s
user    0m9.021s
sys     0m0.052s

$ ./run_test.sh 2
Done, cpu time: 9.240000

real    0m15.297s
user    0m9.169s
sys     0m0.080s

$ ./run_test.sh 0
Done, cpu time: 11.040000

real    0m11.041s
user    0m11.041s
sys     0m0.004s