GCC -O0 still optimizes out "unused" code. Is there a compiler flag to change this?

As I raised in this question, gcc removes (yes, even with -O0) the line of code _mm_div_ss(s1, s2); presumably because the result is not saved. However, this line should trigger a floating-point exception and raise SIGFPE, which cannot happen if the call is removed.

Question: Is there a flag, or a set of flags, to pass to gcc so that the code is compiled as-is? I am thinking of something like -fno-remove-unused, but I don't see anything like that. Ideally this would be a compiler flag rather than a source change, but if that is not supported, is there a gcc attribute or pragma to use instead?

Things I tried:

 $ gcc --help=optimizers | grep -i remove 

No results.

 $ gcc --help=optimizers | grep -i unused 

No results.

And explicitly disabling all dead-code and elimination flags (note that there is no warning about unused code):

    $ gcc -O0 -msse2 -Wall -Wextra -pedantic -Winline \
         -fno-dce -fno-dse -fno-tree-dce \
         -fno-tree-dse -fno-tree-fre -fno-compare-elim -fno-gcse \
         -fno-gcse-after-reload -fno-gcse-las -fno-rerun-cse-after-loop \
         -fno-tree-builtin-call-dce -fno-tree-cselim a.c
    a.c: In function 'main':
    a.c:25:5: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
         __m128 s1, s2;
         ^
    $



Source program

    #include <stdio.h>
    #include <stdlib.h>   /* exit() */
    #include <signal.h>
    #include <string.h>
    #include <xmmintrin.h>

    static void sigaction_sfpe(int signal, siginfo_t *si, void *arg)
    {
        printf("%d,%d,%d\n", signal, si!=NULL?1:0, arg!=NULL?1:0);
        printf("inside SIGFPE handler\nexit now.\n");
        exit(1);
    }

    int main()
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = sigaction_sfpe;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGFPE, &sa, NULL);

        _mm_setcsr(0x00001D80);   /* unmask the divide-by-zero exception */

        __m128 s1, s2;
        s1 = _mm_set_ps(1.0, 1.0, 1.0, 1.0);
        s2 = _mm_set_ps(0.0, 0.0, 0.0, 0.0);
        _mm_div_ss(s1, s2);       /* result discarded: no divss is emitted */

        printf("done (no error).\n");

        return 0;
    }

Compiling and running the above program gives

    $ ./a.out
    done (no error).

Changing the line

 _mm_div_ss(s1, s2); 

to

 s2 = _mm_div_ss(s1, s2); // add "s2 = " 

gives the expected result:

    $ ./a.out
    inside SIGFPE handler



Edit with more details:

This appears to be related to the __always_inline__ attribute on the _mm_div_ss definition.

    $ cat t.c
    int div(int b)
    {
        return 1/b;
    }

    int main()
    {
        div(0);
        return 0;
    }
    $ gcc -O0 -Wall -Wextra -pedantic -Winline t.c -o t.out
    $

(no warnings or errors)

    $ ./t.out
    Floating point exception
    $

versus the following (identical except for the function attribute):

    $ cat t.c
    __inline int __attribute__((__always_inline__))
    div(int b)
    {
        return 1/b;
    }

    int main()
    {
        div(0);
        return 0;
    }
    $ gcc -O0 -Wall -Wextra -pedantic -Winline t.c -o t.out
    $

(no warnings or errors)

    $ ./t.out
    $

Adding the __warn_unused_result__ function attribute at least gives a useful message:

    $ gcc -O0 -Wall -Wextra -pedantic -Winline t.c -o t.out
    t.c: In function 'main':
    t.c:9:5: warning: ignoring return value of 'div', declared with attribute warn_unused_result [-Wunused-result]
         div(0);
         ^
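For reference, a declaration carrying both attributes might look like the sketch below. It is only an illustration: the function is renamed to int_recip (a made-up name, chosen to avoid clashing with div() from <stdlib.h>), and the exact attribute placement used in the real t.c is not shown above, so this layout is an assumption.

```c
/* Hypothetical variant of t.c's helper; int_recip is an illustrative name.
   __always_inline__ makes GCC inline the call even at -O0, and
   __warn_unused_result__ asks for a -Wunused-result warning whenever
   a caller discards the return value. */
static __inline int
__attribute__((__always_inline__, __warn_unused_result__))
int_recip(int b)
{
    return 1 / b;
}
```

With this declaration, a bare statement such as int_recip(x); should produce the warning, while int r = int_recip(x); should not.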

Edit:

There is some discussion on the gcc mailing list. Ultimately, I think that everything is working as intended.

+45
c gcc
03 Oct '16 at 13:34
3 answers

GCC does not "optimize" anything here. It simply does not generate useless code. It seems to be a very common illusion that there is some pure form of code that the compiler should generate, and that any deviation from it is an "optimization". There is no such thing.

The compiler builds a data structure that represents the code, applies some transformations to that data structure, and generates assembler from it, which is then assembled into instructions. Compiling without "optimization" just means that the compiler makes only the least possible effort to generate code.

In this case, the whole statement is useless because it does nothing and its result is thrown away immediately. After the inline functions are expanded and the builtins are resolved, it is equivalent to writing a/b;. The difference is that writing a/b; emits a warning about a statement with no effect, while the builtins are probably not covered by the same warnings. This is not an optimization: the compiler would actually have to expend extra effort to invent a meaning for a meaningless statement, then fake a temporary variable to store the result of that statement, and then throw it away.

What you are looking for are not flags to disable optimizations, but pessimization flags. I don't think any compiler developers waste their time implementing such flags. Except maybe as an April Fools' joke.

+22
03 Oct '16 at 14:16

Why does gcc not emit the specified instruction?

A compiler produces code that must have the observable behavior specified by the standard. Anything that is not observable can be changed (and optimized) at will, since it does not change the behavior of the program (as specified).

How can you beat it into submission?

The trick is to make the compiler believe that the behavior of the particular piece of code is actually observable.

Since this problem comes up frequently in micro-benchmarks, I advise you to look at how (for example) Google Benchmark addresses it. From benchmark_api.h we get:

    template <class Tp>
    inline void DoNotOptimize(Tp const& value) {
        asm volatile("" : : "g"(value) : "memory");
    }

The details of this syntax are boring; for our purposes we only need to know:

  • "g"(value) indicates that value is used as an input to the statement
  • "memory" is a compile-time read/write barrier

So, we can change the code to:

    asm volatile("" : : : "memory");

    __m128 result = _mm_div_ss(s1, s2);

    asm volatile("" : : "g"(result) : );

Which:

  • forces the compiler to consider that s1 and s2 can be changed between their initialization and use
  • forces the compiler to consider that the result of the operation is used

No flag is needed, and it should work at any optimization level (I tested it on https://gcc.godbolt.org/ at -O3).
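For plain C, the same idea might be sketched as follows. This is only an illustration under stated assumptions: it relies on GNU C inline assembly (GCC or Clang), the helper names do_not_optimize_int and forced_div are made up here, and it uses integer division rather than _mm_div_ss so that it does not depend on SSE. It also hides the operands with "+r" constraints instead of a bare "memory" clobber, which is a substitution made in this sketch and not what the snippet above does.

```c
/* Sketch only: GNU C inline-assembly barriers, not a definitive recipe. */

/* Hypothetical helper: the empty asm statement claims to read `value`,
   so the computation that produced it has to be kept. */
static inline void do_not_optimize_int(int value)
{
    __asm__ __volatile__("" : : "g"(value) : "memory");
}

/* Hypothetical helper: divides a by b and keeps the division even when
   the caller ignores the returned value. */
static inline int forced_div(int a, int b)
{
    int result;

    /* "+r" tells the compiler that the asm may have modified a and b,
       so the quotient cannot be computed at compile time. */
    __asm__ __volatile__("" : "+r"(a), "+r"(b));

    result = a / b;
    do_not_optimize_int(result);
    return result;
}
```

A bare statement such as forced_div(1, 0); is then expected to still execute the division instruction, and on x86 that should raise SIGFPE.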

+32
Oct. 03 '16 at 15:41

I'm not an expert on gcc internals, but it seems your problem is not caused by some optimization pass removing dead code. Most likely, the compiler never even considers generating this code in the first place.

Let's reduce your example from compiler-specific intrinsics to plain old addition:

    int foo(int num) {
        num + 77;
        return num + 15;
    }

No code is generated for + 77:

    foo(int):
            push    rbp
            mov     rbp, rsp
            mov     DWORD PTR [rbp-4], edi
            mov     eax, DWORD PTR [rbp-4]
            add     eax, 15
            pop     rbp
            ret

When one of the operands has side effects, only that operand gets evaluated. Still, there is no addition in the assembly.

But storing the result in a variable (even an unused one) forces the compiler to generate code for the addition:

    int foo(int num) {
        int baz = num + 77;
        return num + 15;
    }

Assembly:

    foo(int):
            push    rbp
            mov     rbp, rsp
            mov     DWORD PTR [rbp-20], edi
            mov     eax, DWORD PTR [rbp-20]
            add     eax, 77
            mov     DWORD PTR [rbp-4], eax
            mov     eax, DWORD PTR [rbp-20]
            add     eax, 15
            pop     rbp
            ret

The following is just a guess, but from my experience with compiler construction it is more natural not to generate any code for unused expressions in the first place than to eliminate that code later.

My recommendation is to be explicit about your intentions and store the result of the expression in a volatile variable (which the optimizer therefore cannot remove).

@Matthieu M pointed out that this is not enough to prevent the value from being precomputed. So for anything more than playing with signals, you should use documented ways of executing the exact instruction you want (probably volatile inline assembly).
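As a minimal sketch of the volatile suggestion, assuming integer operands and a made-up helper name div_keep, the store might look like this. The caveat above still applies: if the compiler can see that both operands are constants, it may compute the quotient at compile time and store only the constant.

```c
/* Hypothetical helper for illustration only. The store to a volatile
   object is a side effect the compiler must keep, so the division that
   feeds it is emitted even when the caller ignores the return value. */
int div_keep(int a, int b)
{
    volatile int sink = a / b;  /* store the quotient to a volatile object */
    return sink;                /* read it back from the volatile object */
}
```

Because a and b are function parameters here, the division is expected to happen at run time whenever the call is not inlined with constant arguments.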

+10
03 Oct '16 at 14:17