Why didn't GCC decide for itself whether to inline this function?

Question

From what I have read online, GCC is smart enough to decide on its own whether to inline a function. The inline keyword is just a hint:
GCC can inline an ordinary function, and it can decline to inline a function declared inline.

But for this function in my project:

    struct vb_pos{
        union{
            struct{
                int offset;
                int l;
            };
            unsigned long long g_offset;
        };
    };

    static inline void vi_write_vtail_smart(struct vi *vi){
        struct vb_pos *vhead, *vtail, *cursor;

        vhead = &vi->v_head;
        vtail = &vi->v_tail;
        cursor = &vi->cursor;

        int curoff = vi->curr - vi->lines[vi->currl].buf;
        cursor->offset = curoff;

        if(cursor->g_offset >= vhead->g_offset){
            *vtail = *cursor;
        }
        else{
            *vtail = *vhead;
            *vhead = *cursor;
        }
    }

(Compiled with -O2.)
I checked the assembly code and saw that this function was inlined, as expected.
However, when I removed the inline specifier and recompiled, I found that it was no longer inlined. The function body appeared in the final binary:

    0000000000000000 <vi_write_vtail_smart>:
       0:   48 63 47 14     movslq 0x14(%rdi),%rax
       4:   48 8b 17        mov    (%rdi),%rdx
       7:   48 8d 04 40     lea    (%rax,%rax,2),%rax
       b:   48 8d 04 c2     lea    (%rdx,%rax,8),%rax
       f:   48 8b 57 18     mov    0x18(%rdi),%rdx
      13:   48 2b 10        sub    (%rax),%rdx
      16:   89 57 10        mov    %edx,0x10(%rdi)
      19:   48 8b 47 10     mov    0x10(%rdi),%rax
      1d:   48 3b 47 38     cmp    0x38(%rdi),%rax
      21:   73 0d           jae    30 <vi_write_vtail_smart+0x30>
      23:   48 8b 57 38     mov    0x38(%rdi),%rdx
      27:   48 89 47 38     mov    %rax,0x38(%rdi)
      2b:   48 89 57 40     mov    %rdx,0x40(%rdi)
      2f:   c3              retq
      30:   48 89 47 40     mov    %rax,0x40(%rdi)
      34:   c3              retq

If GCC is smart enough, why didn't it make its own decision here? Why did it inline the function when I asked for it, and not when I didn't?

Is it because it did not find enough evidence to decide either way? Or because it had already decided that inlining makes little difference here, so it inlines when asked and otherwise leaves the function as an ordinary one?

I want to know the real reason.
If it is the first case, I think we may need to reconsider the (very popular) view quoted at the beginning of this post. At the very least, GCC is not as smart as people say, and the inline keyword is not as useless as people say.

Finally, some more context for the code snippet above:

1, I originally wanted vi_write_vtail_smart() to be inlined into the functions A() and B(), which are exported as library APIs and will both be called frequently by users.

2, A() and B() are in the same file as vi_write_vtail_smart().

3, vi_write_vtail_smart() is used only in A() and B(), and nowhere else.

4, The body of A() is about 450 bytes; B() is similar.

5, A() and B() are mostly straight-line code with no large loops or heavy computation, and apart from vi_write_vtail_smart() they call only one other function, which is defined in another file.

6, I did a small test: I added a return; statement just before if(cursor->g_offset >= vhead->g_offset){ (to see what happens when the function is small enough), namely

    ...
    int curoff = vi->curr - vi->lines[vi->currl].buf;
    cursor->offset = curoff;
    return;
    if(cursor->g_offset >= vhead->g_offset){
    ...

I then compiled it without the inline specifier and checked the assembly code. This time GCC inlined it, and its function definition disappeared from the final binary.

7, My development environment:
Ubuntu 16.04, 64-bit
gcc version 5.4.0 20160609
architecture: Intel x86 Ivy Bridge mobile

8, The compilation flags (repeated here because some readers missed them): -O2 -std=gnu99

Tags: c++, c, gcc

weiweishuo Jul 02 '17 at 23:04

1 answer

AnT · Accepted Answer · 2017-07-02T23:23:52+0000

According to the GCC documentation, GCC has an optimization option called -finline-functions. This is the option that makes GCC apply its heuristic criteria to all functions when considering inlining, even those not declared inline. The option is enabled at the -O3 optimization level. So, if you want to give GCC complete freedom to apply its heuristics to all functions, you need to specify at least -O3 (or explicitly specify -finline-functions).

Without -finline-functions, GCC generally does not try to inline functions that are not declared inline, with some notable exceptions: a number of other inlining options can also cause functions not declared inline to be inlined. However, these options target very specific cases.

-finline-functions-called-once is enabled even at -O1. Static functions called only once are inlined even if they are not declared inline.
-finline-small-functions is enabled at -O2. It triggers inlining when that is expected to reduce code size, even if the function is not declared inline.
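
As a rough illustration of the two special cases above, here is a minimal sketch. All function names in it are hypothetical and not from the question, and whether GCC really inlines each helper depends on the GCC version and its size estimates, so treat the comments as expectations to check in the disassembly rather than guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical static helper with exactly one caller: the kind of function
   that -finline-functions-called-once (enabled from -O1) is aimed at. */
static int64_t sum_of_squares(const int32_t *v, int n)
{
    assert(n >= 0);
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int64_t)v[i] * v[i];
    return acc;
}

/* Hypothetical tiny helper with two callers: the kind of function that
   -finline-small-functions (enabled from -O2) is aimed at, because its
   body is likely no larger than the call sequence it replaces. */
static int32_t clamp_to_byte(int32_t x)
{
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

/* Exported callers, so the helpers cannot simply be discarded as unused. */
int64_t energy(const int32_t *v, int n) { return sum_of_squares(v, n); }
int32_t red(int32_t x)   { return clamp_to_byte(x); }
int32_t green(int32_t x) { return clamp_to_byte(x + 1); }
```

Compiling this with gcc -O2 -c and inspecting the object file with objdump -d shows whether standalone copies of sum_of_squares and clamp_to_byte remain.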

Your function does not seem to pass these special-case filters that operate at the -O2 level: it is relatively large and (apparently) called more than once. For this reason, GCC does not consider it for inlining at -O2 unless you explicitly request it with the inline keyword. Note that an explicit inline is roughly equivalent to -finline-functions enabled for that one function only. It makes GCC consider the function for inlining, but it does not guarantee that the function will be inlined.

Again, if you want GCC to make these decisions entirely on its own, you need -finline-functions or -O3. The fact that an explicit inline triggers inlining at -O2 suggests that GCC should also decide to inline the function at -O3, regardless of whether inline is present.
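
To try this out, a small self-contained reproduction along the following lines may help. It is only a sketch: struct vline, the layout of struct vi, the USE_INLINE macro, and the bodies of A() and B() are assumptions made up for illustration, since the question does not show them. Only struct vb_pos and vi_write_vtail_smart() are taken from the question.

```c
#include <assert.h>
#include <stddef.h>

struct vb_pos {
    union {
        struct { int offset; int l; };
        unsigned long long g_offset;
    };
};

/* Assumed stand-ins for the real types, which the question does not show. */
struct vline { char *buf; size_t len; size_t cap; };
struct vi {
    struct vline *lines;
    int currl;
    char *curr;
    struct vb_pos cursor, v_head, v_tail;
};

/* Build with -DUSE_INLINE to add the inline specifier (hypothetical macro). */
#ifdef USE_INLINE
#define VI_INLINE inline
#else
#define VI_INLINE
#endif

static VI_INLINE void vi_write_vtail_smart(struct vi *vi)
{
    struct vb_pos *vhead = &vi->v_head;
    struct vb_pos *vtail = &vi->v_tail;
    struct vb_pos *cursor = &vi->cursor;

    int curoff = vi->curr - vi->lines[vi->currl].buf;
    cursor->offset = curoff;

    if (cursor->g_offset >= vhead->g_offset) {
        *vtail = *cursor;
    } else {
        *vtail = *vhead;
        *vhead = *cursor;
    }
}

/* Hypothetical exported callers; the real A() and B() are much larger.
   The asserts disappear when compiling with -DNDEBUG. */
void A(struct vi *vi)
{
    assert(vi != NULL);
    vi_write_vtail_smart(vi);
}

void B(struct vi *vi)
{
    assert(vi != NULL);
    vi->curr = vi->lines[vi->currl].buf;   /* jump to the start of the line */
    vi_write_vtail_smart(vi);
}
```

Saved as, say, repro.c, it can be built three ways and each object file inspected with objdump -d: gcc -O2 -std=gnu99 -DNDEBUG -c repro.c, the same command with -DUSE_INLINE added, and the first command with -O3 instead of -O2. If the answer's reasoning holds for your GCC version, a standalone vi_write_vtail_smart should appear only in the first build, but results may differ between GCC versions.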
