I have a complicated program that uses std::array<double, N> for small values of N. It uses operator[] to read values from these arrays.
I found that GCC 6.1 with -O2 or -O3 does not inline these calls, which makes these C++ arrays slower than their C equivalents.
Here is the generated assembly:
    340 <std::array<double, 8ul>::operator[](unsigned long) const>:
     340: 48 8d 04 f7             lea    (%rdi,%rsi,8),%rax
     344: c3                      retq
     345: 90                      nop
     346: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
     34d: 00 00 00
The same code is emitted for each array size (since bounds checking is not performed).
A loop over such an array looks like this:
     4c0: e8 7b fe ff ff          callq  340 <std::array<double, 8ul>::operator[](unsigned long) const>
     4c5: be 07 00 00 00          mov    $0x7,%esi
     4ca: 4c 89 f7                mov    %r14,%rdi
     4cd: 48 89 44 24 78          mov    %rax,0x78(%rsp)
     ...6 more copies of this...
     4d2: e8 69 fe ff ff          callq  340 <std::array<double, 8ul>::operator[](unsigned long) const>
     4d7: 48 89 44 24 70          mov    %rax,0x70(%rsp)
     4dc: 31 f6                   xor    %esi,%esi
     4de: 4c 89 ef                mov    %r13,%rdi
This seems obviously bad. The problem is that small test programs do not reproduce this behavior.
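A minimal sketch of such a small test program, assuming a simple summation loop (the name sum is made up for illustration and is not taken from the real program). It only shows the access pattern in question, namely reading elements through the const operator[] of std::array<double, 8>:

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not from the real program: sums the elements
// through std::array::operator[], the member function shown in the
// disassembly above.
template <std::size_t N>
double sum(const std::array<double, N>& a) {
    double total = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        total += a[i];  // const operator[] on std::array<double, N>
    }
    return total;
}

// Explicit instantiation so that code for N = 8 is emitted and the
// object file can be disassembled (for example with objdump -d).
template double sum<8>(const std::array<double, 8>&);
```

Disassembling the object file built from a test like this shows whether a callq to operator[] is present, which can then be compared with the output from the real program.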
So my question is: how can I get GCC to tell me why it is not inlining these single-instruction calls, and/or make it inline them? Obviously, I cannot change the <array> header to add __attribute__((inline)).