The compiler cannot inline the call in foo1 because the callee is not a compile-time constant at that point. If it learns that a constant argument was passed to foo1, for example because foo1 itself was inlined into its caller, it will inline the correct function.
Consider the following example:
    namespace {

    template<int N>
    int bar() {
        return N;
    }

    int foo1(int n) {
        // Tell the optimizer that n is always within [0, 5].
        if (n < 0 || n > 5) {
            __builtin_unreachable();
        }
    #if __clang__
        __builtin_assume(n >= 0 && n <= 5);
    #endif
        // Table of function pointers; the callee depends on the runtime value of n.
        static int (* const fns[])() = {
            bar<0>, bar<1>, bar<2>, bar<3>, bar<4>, bar<5>
        };
        return fns[n]();
    }

    }

    int main(int argc, char** argv) {
        int n = foo1(3);
        return n;
    }

Both compilers compile it to the following code:

    main:
        mov eax, 3
        ret
In the case of foo2, the compiler starts with 5 different calls with constant callees, all of which it inlines. It then optimizes the resulting code further, generating its own jump table if it considers that profitable.
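foo2 itself is not reproduced in this answer. As a rough sketch only, assuming it dispatches through a switch over five cases (the body below is a hypothetical reconstruction, not the code from the question), it might look like this:

```cpp
namespace {

template<int N>
int bar() {
    return N;
}

// Hypothetical foo2: the exact body is an assumption for illustration.
// Every case names its callee directly, so each call has a
// compile-time constant target that the compiler is free to inline.
int foo2(int n) {
    switch (n) {
        case 0: return bar<0>();
        case 1: return bar<1>();
        case 2: return bar<2>();
        case 3: return bar<3>();
        case 4: return bar<4>();
        default: __builtin_unreachable();  // gcc/clang builtin: n is assumed to be in [0, 4]
    }
}

}
```

The difference from foo1 is that no function pointer is loaded at run time here: each call site refers to one specific bar<N>, which is what lets inlining happen before any jump table is created.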
I suppose the compiler could try to reconstruct the switch from your jump table and then inline everything, but that would be rather complicated and unlikely to give a performance improvement in the general case, so neither gcc nor clang seems to do it.
Paulr