Is cutting out if statements by using function pointers going to be more efficient?

So, there's this rule to try to pull if statements out of high-repetition loops:

    for( int i = 0 ; i < 10000 ; i++ ) {
        if( someModeSettingOn ) doThis( data[i] ) ;
        else doThat( data[i] ) ;
    }

They say it's better to break it up and put the if statement outside:

    if( someModeSettingOn )
        for( int i = 0 ; i < 10000 ; i++ )
            doThis( data[i] ) ;
    else
        for( int i = 0 ; i < 10000 ; i++ )
            doThat( data[i] ) ;

(In case you're saying "Ho! Don't optimize that yourself! The compiler will do it!") Sure, the optimizer might do this for you. But in Typical C++ Bullshit (I don't agree with all of its points, for example its attitude towards virtual functions) Mike Acton says "Why make the compiler guess at something you know?" That is a pretty good point from those stickies, for me.

So why not use a function pointer?

    FunctionPointer *fp ;
    if( someModeSettingOn ) fp = func1 ;
    else fp = func2 ;

    for( int i = 0 ; i < 10000 ; i++ ) {
        fp( data[i] ) ;
    }

Is there some kind of hidden overhead to function pointers? Is it as efficient as calling a function directly?

+6
8 answers

In this example, it is impossible to say which case will be faster. You need to profile this code on the target platform and compiler to find out.

And in general, in 99% of cases, such code does not need to be optimized. This is an example of evil premature optimization. Write human-readable code, and optimize it only if necessary, after profiling.

+8

Don't guess, measure.

But if I absolutely had to guess, I would say that the third option (function pointer) will be slower than the second option (if outside the loops), which, I suspect, may play better with the CPU's branch prediction.

The first option may or may not be equivalent to the second, depending on how smart the compiler is, as you have already noted.
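
As a rough illustration of what "measure" could look like, here is a minimal timing sketch of the three variants. The bodies of doThis and doThat, the summing of results, and the helper timeMicros are all assumptions made for the example, not code from the question. A real measurement should use the actual functions, optimized builds, and the target compiler and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for doThis / doThat from the question.
int doThis(int x) { return x + 1; }
int doThat(int x) { return x * 2; }

// Variant 1: the if stays inside the loop.
long long ifInside(const std::vector<int>& data, bool mode) {
    long long sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (mode) sum += doThis(data[i]);
        else      sum += doThat(data[i]);
    }
    return sum;
}

// Variant 2: the if is hoisted out of the loop.
long long ifOutside(const std::vector<int>& data, bool mode) {
    long long sum = 0;
    if (mode) {
        for (std::size_t i = 0; i < data.size(); ++i) sum += doThis(data[i]);
    } else {
        for (std::size_t i = 0; i < data.size(); ++i) sum += doThat(data[i]);
    }
    return sum;
}

// Variant 3: choose a function pointer once, call through it in the loop.
long long viaPointer(const std::vector<int>& data, bool mode) {
    int (*fp)(int) = mode ? doThis : doThat;
    long long sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) sum += fp(data[i]);
    return sum;
}

// Runs one variant `repeats` times and returns the elapsed microseconds.
// Each result is added to `sink` so that the computed values are actually used.
template <typename Variant>
long long timeMicros(Variant variant, const std::vector<int>& data, bool mode,
                     int repeats, long long& sink) {
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) sink += variant(data, mode);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling timeMicros(ifInside, data, true, 1000, sink) and likewise for the other two variants gives numbers to compare. Treat them with care: an optimizer may still transform or merge the loops, so checking the generated assembly alongside the timings is advisable.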

+6

Why would the compiler guess what you know?

Because you may be complicating the code for future maintainers without bringing any tangible benefit to the users of your code. This change smells of premature optimization, and only after profiling would I consider anything other than the obvious implementation (the if inside the loop).

Assuming profiling shows that this is a problem, then, as a guess, I believe that pulling the if out of the loop will be faster than the function pointer, because the pointer may add a level of indirection that the compiler cannot optimize away. It will also reduce the likelihood that the compiler will be able to inline any of the calls.

However, I would also consider an alternative design using an abstract interface instead of an if inside the loop. Then each data object already knows what to do automatically.
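
A minimal sketch of what such an interface-based design might look like. The names Item, ThisItem, ThatItem and processAll are hypothetical, and the arithmetic merely stands in for doThis and doThat.

```cpp
#include <memory>
#include <vector>

// Hypothetical interface: each data object carries its own behaviour.
struct Item {
    virtual ~Item() = default;
    virtual int process() const = 0;
};

struct ThisItem : Item {
    int value;
    explicit ThisItem(int v) : value(v) {}
    int process() const override { return value + 1; }  // plays the role of doThis
};

struct ThatItem : Item {
    int value;
    explicit ThatItem(int v) : value(v) {}
    int process() const override { return value * 2; }  // plays the role of doThat
};

// The loop contains no if: the virtual call selects the behaviour per object.
long long processAll(const std::vector<std::unique_ptr<Item>>& items) {
    long long sum = 0;
    for (const auto& item : items) sum += item->process();
    return sum;
}
```

Note that a virtual call is itself an indirect call, so the indirection and inlining concerns raised for function pointers apply here as well. The appeal of this design is maintainability, not guaranteed speed.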

+6

My bet would be on the second version, with the if/else outside the loop, being the fastest, provided that I get a refund when we benchmark it across the widest range of compilers. :-D I make this bet after quite a number of years with VTune in hand.

That said, I would be happy to lose the bet. I think it is quite possible that many compilers can now optimize the first version to rival the second, by detecting that you are repeatedly checking a variable that does not change inside the loop, and therefore effectively hoisting the branch outside the loop.

However, I have not yet come across a case where I saw an optimizer do the equivalent of inlining an indirect function call ... although if there were a case an optimizer could handle, yours would definitely be the easiest, since the function addresses are assigned in the same function that calls them through the function pointer. I would be very pleasantly surprised if optimizers can do this now, especially because I like your third version in terms of maintainability (it is the easiest one to change if we want to add new conditions that lead to different functions being called, for example).

Nevertheless, if the call cannot be inlined, then the function pointer solution will tend to be the most expensive, not only because of the long jump and, possibly, additional stack spills and so on, but also because the optimizer lacks information. There is an optimizer barrier when it does not know which function is going to be called through the pointer. At that point, it can no longer combine all this information in the IR and do the best job of instruction selection, register allocation, etc. This compiler-side aspect of indirect function calls is not discussed often, but it is potentially the most expensive part of calling a function indirectly.
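
One possible way to keep some of the maintainability of the function pointer version while leaving the call target visible to the optimizer is to pass the operation as a template parameter. This is a sketch under assumptions and was not proposed in the thread: runLoop and process are hypothetical names, and the lambdas stand in for doThis and doThat. Whether a given compiler actually inlines the calls still has to be checked in the generated assembly.

```cpp
#include <cstddef>
#include <vector>

// The callable's concrete type is part of the instantiation, so the compiler
// knows exactly which code op(...) refers to and is free to inline it.
template <typename Op>
long long runLoop(const std::vector<int>& data, Op op) {
    long long sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) sum += op(data[i]);
    return sum;
}

long long process(const std::vector<int>& data, bool someModeSettingOn) {
    // The branch is taken once, outside the loop; each arm instantiates
    // runLoop with a different callable type.
    if (someModeSettingOn) return runLoop(data, [](int x) { return x + 1; });
    return runLoop(data, [](int x) { return x * 2; });
}
```

The trade-off is that each distinct callable produces its own copy of the loop, which is essentially the hand-written second version generated by the compiler.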

+2

Not sure if it qualifies as "hidden", but of course, using a function pointer requires one more level of indirection.

The compiler must generate code to dereference the pointer and then jump to the resulting address, as opposed to code that jumps directly to a constant address, as for a normal function call.

+1

You have three cases:

The if inside the loop, the function pointer dereference inside the loop, and the if outside the loop.

Of the three, ignoring any compiler optimization, the third will be the best. The first performs a conditional on every iteration, the second adds a pointer dereference on top of the code you want to run, and the third just runs what you want.

If you want to optimize this yourself, DO NOT go with the function pointer version! If you do not trust the compiler to optimize, then the extra indirection may cost you, and it is much easier to break later on (in my opinion).

+1

You should measure which is faster, but I highly doubt that the function pointer answer will be faster. Checking a flag probably has zero latency on modern processors with deep, multiple pipelines, whereas a function pointer makes it likely that the compiler will be forced to make an actual function call, pushing registers, etc.

"Why would the compiler guess what you know?"

Both you and the compiler know some things at compile time, but the processor knows even more at run time, for example, whether there are empty pipelines in that inner loop. The days of this kind of optimization are over, outside of embedded systems and graphics shaders.

+1

The others all raise very valid points, most notably that you have to measure. I want to add three things:

  • An important aspect is that using function pointers often inhibits inlining, which can hurt the performance of your code. But it definitely depends. Try playing with the godbolt compiler explorer and look at the assembly:

    https://godbolt.org/g/85ZzpK

    Note that when doThis and doThat are not defined, as could happen across DSO boundaries for example, there will not be much of a difference.

  • The second point is related to branch prediction. Take a look at https://danluu.com/branch-prediction/ . It should make clear that the code you have here is actually the ideal case for a branch predictor, so perhaps you don't need to worry. Again, a good profiler such as perf or VTune will tell you whether you suffer from branch mispredictions or not.

  • Finally, there was at least one scenario I have seen where hoisting the conditional out of the loop made a big difference, despite the reasoning above. This was in a tight mathematical loop that was not auto-vectorized because of the conditional. GCC and Clang can both report which loops get vectorized, or why a loop was not. In my case, the conditional really was the problem for the auto-vectorizer. That was with GCC 4.8, though, so the situation may have changed since then. With Godbolt, it's pretty easy to check whether this is a problem for you. Again, always measure on your target machine and see if you are affected.
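
To see what the vectorizer makes of such a loop, something like the following could be compiled with the reporting flags shown in the comments. This is a sketch: the function names and the arithmetic are made up for illustration, and out is assumed to be at least as large as in. Recent compilers may well vectorize both versions, so the report is the thing to look at.

```cpp
#include <cstddef>
#include <vector>

// Vectorization reports (the file name is a placeholder):
//   g++     -O3 -fopt-info-vec-optimized -fopt-info-vec-missed -c loops.cpp
//   clang++ -O3 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -c loops.cpp

// Loop-invariant conditional left inside the hot loop.
void scaleIfInside(std::vector<float>& out, const std::vector<float>& in,
                   bool squareMode) {
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (squareMode) out[i] = in[i] * in[i];
        else            out[i] = in[i] * 2.0f;
    }
}

// Same computation with the conditional hoisted out by hand.
void scaleIfOutside(std::vector<float>& out, const std::vector<float>& in,
                    bool squareMode) {
    if (squareMode) {
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] * in[i];
    } else {
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] * 2.0f;
    }
}
```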

+1

Source: https://habr.com/ru/post/918481/

