Profiling a single function predictably

I need a better way to profile numeric code. Assume I am using GCC under Cygwin on 64-bit x86 and that I am not going to buy a commercial tool.

The situation is as follows. I have a single function running in a single thread. There are no code or I/O dependencies beyond memory access, with the possible exception of some linked math libraries. For the most part it is all table lookups, index calculations, and numeric processing. I have cache-aligned all arrays on the heap and the stack. Because of the complexity of the algorithm(s), loop unrolling, and long macros, the assembly listing can get quite long: thousands of instructions.

I have been using the tic/toc timer in Matlab, the time utility in the bash shell, or the time stamp counter (rdtsc) directly around the function. The problem is this: the timing variance (which can be as much as 20% of the runtime) is larger than the improvements I am making, so I cannot tell whether the code is better or worse after a change. You might think it is time to give up at that point. I disagree. If you are persistent, many incremental improvements can add up to a two- or three-fold performance increase.
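For illustration only, here is a minimal sketch of one common way to make such timings more comparable: run the function many times and compare the minimum or the median rather than a single reading. It assumes a POSIX `clock_gettime` (which Cygwin provides) in place of raw rdtsc, and the names `now_ns`, `summarize`, `time_function`, and `timing_summary` are hypothetical, not taken from the question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_gettime under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical result type: the two statistics least affected by outliers. */
typedef struct {
    uint64_t min;     /* fastest observed run */
    uint64_t median;  /* middle run (upper median when n is even) */
} timing_summary;

/* Monotonic clock reading in nanoseconds (assumption: used instead of rdtsc). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* qsort comparator for uint64_t that avoids overflow from subtraction. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Sorts the samples in place and returns their minimum and median. */
timing_summary summarize(uint64_t *samples, size_t n)
{
    timing_summary s;
    assert(n > 0);
    qsort(samples, n, sizeof samples[0], cmp_u64);
    s.min = samples[0];
    s.median = samples[n / 2];
    return s;
}

/* Times fn() n times, storing each duration in the caller-provided buffer. */
timing_summary time_function(void (*fn)(void), uint64_t *samples, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t t0 = now_ns();
        fn();
        samples[i] = now_ns() - t0;
    }
    return summarize(samples, n);
}
```

A caller would pass the function under test and a buffer of, say, a few hundred samples to `time_function`, then compare `min` or `median` between the old and new builds. The minimum is usually the steadier of the two when the noise comes from interruptions, since interference can only make a run slower.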

One problem I have run into several times, and which is especially maddening, is that I make a change and performance appears to improve consistently, say by 20%. The next day the gain is gone. Maybe I made what I thought was a harmless code change and then completely forgot about it. But I wonder whether something else is going on. Perhaps GCC does not produce 100% deterministic output, as I believe it does. Or maybe it is something simpler, such as the OS moving my process to a busier core.

, , - . , , , . , , .

  • ​​ .
  • ( ).
  • , DLL . , , , - / .
  • - ( ).
  • ? ? - , , , .

RISC, . , , . (, TI Code Composer C67x) , ALU.

, GCC/GAS, . . , . , , , , x86.

gcov , - GCC, , MinGW, .

, , , , .


(RE: )

, , ? , Visual Studio . DLL, GCC Cygwin. mex DLL, Matlab VS2013.

, Matlab, , . , DLL Matlab , , .

, GCC, , , Microsoft. . , Microsoft, , , C (C99). , , GCC , , , . , .

, , . ; , . , , - ( ). . . , . , . , , , , . , , , . , , . , .

, . , - , . , . .

, , . 10% - . 10% - . 10% - . ? , . , , , . , - , , . . , , , .

, . AVX . , , parallelism. , , . , 256- . , .

, , , , , , , , .

, , ( , , ), " " , , ?


, , , - , , , , , ?

-, , . , ( , , ), , , . , 30% - -, . . . , . , . , : ", , , , ",

. , . " X, Y% ", ", , Z ", . , , , ( , ). - , , .

, , , , , , .

. , powerpoint, , , , , .


, Linux, . , ( " " ), , 5-10% SPEC2000 ( Windows - ).

1% :

  • ( BIOS, Linux, )
  • , "Turbo boost" .. (BIOS, )
  • , ( ​​0 - - , )
  • ( ) - systemd
  • ASLR
  • drop pagecache

, 1% .

github , .

- EDIT -

script .


Source: https://habr.com/ru/post/1661872/

