Capturing SIGFPE from SIMD instruction

I am trying to clear the floating-point divide-by-zero flag so that this exception is ignored. I expect that with the flag set (which I believe is the default behavior, so the call is commented out below), my error handler will fire. However, _mm_div_ss does not seem to raise SIGFPE. Any ideas?

    #include <stdio.h>
    #include <stdlib.h>   /* exit */
    #include <signal.h>
    #include <string.h>
    #include <xmmintrin.h>

    static void sigaction_sfpe(int signal, siginfo_t *si, void *arg)
    {
        printf("inside SIGFPE handler\nexit now.");
        exit(1);
    }

    int main()
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = sigaction_sfpe;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGFPE, &sa, NULL);

        //_mm_setcsr(0x00001D80); // catch all FPE except divide by zero

        __m128 s1, s2;
        s1 = _mm_set_ps(1.0, 1.0, 1.0, 1.0);
        s2 = _mm_set_ps(0.0, 0.0, 0.0, 0.0);
        _mm_div_ss(s1, s2);

        printf("done (no error).\n");

        return 0;
    }

Output of the code above:

    $ gcc a.c
    $ ./a.out
    done (no error).

As you can see, my handler is never reached. Note: I tried a couple of different compiler flags (-msse3, -march=native) with no change.

gcc (Debian 5.3.1-7) 5.3.1 20160121

Some information from /proc/cpuinfo:

    model name : Intel(R) Core(TM) i3 CPU M 380 @ 2.53GHz
    flags      : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm arat dtherm tpr_shadow vnmi flexpriority ept vpid
c signals simd
Sep 25 '16 at 18:16
1 answer

Two things.

Firstly, I misunderstood the documentation. Exceptions must be unmasked to be caught. Calling _mm_setcsr(0x00001D80); unmasks the divide-by-zero exception, which allows SIGFPE to fire on division by zero.
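To make the magic number less opaque: 0x1D80 is the default MXCSR value 0x1F80 with the divide-by-zero mask bit (0x0200) cleared. Here is a minimal sketch of that arithmetic, assuming the _MM_MASK_DIV_ZERO macro from <xmmintrin.h>. The helper name unmask_div_zero is made up for illustration and is not part of any library.

```c
#include <xmmintrin.h>

/* Hypothetical helper: take an MXCSR value and clear the divide-by-zero
 * mask bit, so that a division by zero traps instead of quietly
 * producing infinity. All other bits are left as they were. */
static unsigned int unmask_div_zero(unsigned int csr)
{
    return csr & ~(unsigned int)_MM_MASK_DIV_ZERO;
}
```

Something like _mm_setcsr(unmask_div_zero(_mm_getcsr())); should then have the same effect as the hard-coded constant, as long as MXCSR still holds its default value at that point.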

Secondly, gcc optimized out my division instruction even with -O0.

Given the source line

 _mm_div_ss(s1, s2); 

Compiling with gcc -S -O0 -msse2 a.c gives

    76   movaps  -24(%ebp), %xmm0
    77   movaps  %xmm0, -72(%ebp)
    78   movaps  -40(%ebp), %xmm0
    79   movaps  %xmm0, -88(%ebp)
    a1   subl    $12, %esp          ; renumbered to show insertion below
    a2   pushl   $.LC2
    a3   call    puts
    a4   addl    $16, %esp

While the source line

 s2 = _mm_div_ss(s1, s2); // add "s2 = " 

gives

    76   movaps  -24(%ebp), %xmm0
    77   movaps  %xmm0, -72(%ebp)
    78   movaps  -40(%ebp), %xmm0
    79   movaps  %xmm0, -88(%ebp)
         movaps  -72(%ebp), %xmm0
         divss   -88(%ebp), %xmm0
         movaps  %xmm0, -40(%ebp)
    a1   subl    $12, %esp
    a2   pushl   $.LC2
    a3   call    puts
    a4   addl    $16, %esp

With these changes, the SIGFPE handler is called according to the divide-by-zero flag in MXCSR.

Sep 26 '16 at 2:04