Other behavior or sqrt when compiling with 64 or 32 bits

I use the sqrt () function from the math library when I build for 64bit using -m64. I get the correct result, but when I build for 32-bit, I have very inconsistent behavior.

For example, on a 64-bit page

double dx = 0x1.fffffffffffffp+1023; sqrt(dx); // => 0x1.fffffffffffffp+511 sqrt(0x1.fffffffffffffp+1023);// => 0x1.fffffffffffffp+511 

(which, I believe, is a properly rounded result tested with mpfr)

But on a 32-bit input value, it behaves differently.

 double dx = 0x1.fffffffffffffp+1023; sqrt(dx); // => 0x1.0p+512 sqrt(0x1.fffffffffffffp+1023); // => 0x1.fffffffffffffp+511 

When the same value is passed to a variable, I get the wrong result. I checked the rounding mode before and after each call, and they are all set to round to the nearest. What reason? I use gcc 4.6 on a 64-bit machine, and the -mfpmath=sse and -march=pentium options for both x86 nad x64 cases.

+6
source share
2 answers

You did not say which compiler or architect you use, but assuming gcc on x86 / x86-64 then the difference is most likely due to the fact that by default gcc uses 387 floating-point instructions for 32 bits x86, whereas it uses SSE instructions on x86-64.

387 floating point registers are 80 bits wide, and double is 64 bits wide. This means that intermediate results can be more accurate using instructions 387, which can lead to a slightly different answer after rounding. (SSE2 instructions work on packed 64-bit doubles).

There are several ways to change the way the compiler works, depending on what you want:

  • If you use the -ffloat-store option to build x86, the compiler will discard extra precision whenever you store a value in a double variable;
  • If you use the -mfpmath=sse to build x86, as well as -msse2 or -march= , which specifies an architecture that supports SSE2, the compiler will use the SSE instructions for floating point exactly the same as on x86-64. The code will only work on processors that support SSE2 (Pentium-M / Pentium 4 and higher).
  • If you use the -mfpmath=387 option to build x86-64, the compiler will use 387 floating point instructions in exactly the same way as on x86. This is not recommended, however - ABI x86-64 indicates that floating point values ​​are passed to SSE registers, so the compiler must do a lot of shuffling between 387 and SSE registers with this option.
+6
source

Some compilers, such as gcc , when they see certain functions of a mathematical library that are executed on static literals, actually calculate the value at compile time, where, like with a variable, it is calculated at runtime as necessary. The compilation time value is usually calculated by the compiler using a mathematical library such as MPFR, GNU MP, etc. Therefore, the results will be more accurate or, at least, as accurate as possible, from platform to platform.

+6
source

Source: https://habr.com/ru/post/916958/