What does clang's `-Ofast` option do in practical terms, especially regarding any differences from gcc?

Similar to the SO question "What does gcc's ffast-math actually do?", and related to the SO question "Clang optimization levels", I'm wondering what clang's -Ofast optimization does in practical terms, and whether it differs at all from gcc or depends more on the hardware than on the compiler.

According to the accepted answer on clang's optimization levels, -Ofast adds the following to the -O3 optimizations: -fno-signed-zeros -freciprocal-math -ffp-contract=fast -menable-unsafe-fp-math -menable-no-nans -menable-no-infs. These all seem to be related to floating-point math. But what do these optimizations mean in practical terms for things like the C++ common mathematical functions for floating-point numbers on a CPU such as an Intel Core i7, and how reliable are these differences?

For example, in practice:

The code std::isnan(std::numeric_limits<float>::infinity() * 0) returns true for me with -O3. I believe this is what is expected of IEEE-compliant results.

With -Ofast, however, I get a false return value. In addition, the operation (std::numeric_limits<float>::infinity() * 0) == 0.0f returns true.
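The observation above can be packaged as a small probe. This is a minimal sketch: the names InfTimesZero and probe_inf_times_zero are hypothetical, and the zero is passed as a parameter only in an attempt to keep the multiplication from being trivially constant-folded (the compiler may still fold it after inlining). Built without fast-math flags, IEEE 754 rules make the product NaN; the question reports the opposite outcome under -Ofast.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper type: records what the product looks like.
struct InfTimesZero {
    bool is_nan;       // result of std::isnan(product)
    bool equals_zero;  // result of product == 0.0f
};

// Multiplies +infinity by the given value (pass 0.0f to reproduce the question).
InfTimesZero probe_inf_times_zero(float zero) {
    const float product = std::numeric_limits<float>::infinity() * zero;
    return { std::isnan(product), product == 0.0f };
}
```

Calling probe_inf_times_zero(0.0f) from a build with -O3 and again from a build with -Ofast is one way to compare the two behaviours on a given compiler and target.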

I don't know whether this is the same as what is seen with gcc. It's not clear to me how much the results depend on the architecture, how much they depend on the compiler, or whether there is any applicable standard for -Ofast.

If anyone has produced something like a set of unit tests or code koans that answers this, that would be ideal. I've started on something similar, but would rather not reinvent the wheel.

1 answer



-fno-signed-zeros

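Tells the compiler that the sign of zero can be ignored.
FP arithmetic has special rules w.r.t. zero: 0 · x = x · 0 is not always 0, because zeros are signed; for example, -3 · 0 = -0 ≠ 0 (where 0 means +0). The two zeros compare equal with ==, but they are distinct values that differ in the sign bit. This is why the multiplication below has to be kept at -O3: the result depends on the sign of a, and also on whether a is infinite or NaN.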

Live on Godbolt, with and without -Ofast:

float f(float a)
{
    return a*0;
}

;With -Ofast
f(float):                                  # @f(float)
        xorps   xmm0, xmm0
        ret

;With -O3
f(float): # @f(float)
  xorps xmm1, xmm1
  mulss xmm0, xmm1
  ret

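To make the -3 · 0 = -0 case observable from C++, a sketch like the following can help. The function names are hypothetical; std::signbit is used because -0 and +0 compare equal under ==. The expected behaviour described in the comments assumes a build without fast-math flags.

```cpp
#include <cmath>

// Same shape as f(float) in the listing above.
float times_zero(float a) {
    return a * 0.0f;
}

// True when a * 0 is a zero whose sign bit is set, i.e. -0.
// Under IEEE 754 rules this holds for finite negative a.
bool result_is_negative_zero(float a) {
    const float r = times_zero(a);
    return r == 0.0f && std::signbit(r);
}
```

With -fno-signed-zeros (and therefore with -Ofast) the compiler is allowed to return +0 regardless of the sign of a, which matches the xorps-only version in the listing.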

-freciprocal-math

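Uses multiplication by the reciprocal instead of division: a/b = a · (1/b).
Because FP precision is finite, the two sides are not really equal: the reciprocal is itself rounded.
Multiplication is faster than division, however, which is why the compiler prefers it when allowed.
See also "Why is -freciprocal-math unsafe in GCC?".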

Live Godbolt:

float f(float a){
    return a/3;
}

;With -Ofast
.LCPI0_0:
        .long   1051372203              # float 0.333333343
f(float):                                  # @f(float)
        mulss   xmm0, dword ptr [rip + .LCPI0_0]
        ret

;With -O3
.LCPI0_0:
  .long 1077936128 # float 3
f(float): # @f(float)
  divss xmm0, dword ptr [rip + .LCPI0_0]
  ret
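The difference between the divss and mulss versions above can be measured directly. The sketch below (the function name is hypothetical) counts the small integers for which a/3 and a · (1/3) round to different floats. It assumes a build without fast-math flags; otherwise the compiler may turn both expressions into the same code and the count becomes meaningless.

```cpp
// Counts integers a in [1, limit] for which dividing by 3 and multiplying
// by the rounded reciprocal of 3 give different float results.
int count_reciprocal_mismatches(int limit) {
    const float third = 1.0f / 3.0f;  // rounded reciprocal (0.333333343 in the listing)
    int mismatches = 0;
    for (int i = 1; i <= limit; ++i) {
        const float a = static_cast<float>(i);
        if (a / 3.0f != a * third) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Working one case by hand, a = 5 is expected to be among the mismatches: the reciprocal is rounded slightly above 1/3, which pushes 5 · (1/3) across a rounding boundary that 5/3 stays below.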

-ffp-contract=fast

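Enables contraction of FP expressions.
This belongs to the category of optimizations that treat FP numbers as elements of the field ℝ, where identities such as a * k/k = a always hold.

FP numbers, however, do not form a field under + and ·, so rewriting an expression this way can change its result. (Strictly speaking, -ffp-contract governs the fusing of operations, such as a multiply followed by an add into a single fused multiply-add; the cancellation in the example that follows is most likely enabled by the other unsafe-math flags.)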

Live Godbolt:

float f(float a){
    return a/3*3;
}

;With -Ofast 
f(float):                                  # @f(float)
        ret

;With -O3
.LCPI0_0:
  .long 1077936128 # float 3
f(float): # @f(float)
  movss xmm1, dword ptr [rip + .LCPI0_0] # xmm1 = mem[0],zero,zero,zero
  divss xmm0, xmm1
  mulss xmm0, xmm1
  ret
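Whether a/3*3 really returns a in IEEE arithmetic can be checked with a sketch like this one. The function name and the choice of range are assumptions made for illustration; the range suggested in the usage note was picked to include large odd values just below 2^24, where the final multiplication can land on a rounding tie. A build without fast-math flags is assumed.

```cpp
// Counts integers a in [first, last] for which (a / 3) * 3 != a in float.
// All values used here must be below 2^24 so that they are exactly
// representable as float.
int count_roundtrip_mismatches(int first, int last) {
    int mismatches = 0;
    for (int i = first; i <= last; ++i) {
        const float a = static_cast<float>(i);
        const float roundtrip = a / 3.0f * 3.0f;
        if (roundtrip != a) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

For example, count_roundtrip_mismatches(16777115, 16777215) is expected to be nonzero: working it by hand, 16777213 / 3 rounds to 5592404.5, and multiplying that by 3 gives 16777213.5, a tie that rounds to the even neighbour 16777214. The -Ofast version, which simply returns a, therefore does not always match the -O3 one.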

-menable-unsafe-fp-math

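Similar to the above, but in a broader sense.

It allows optimizations that do not strictly follow the IEEE rules (reassociating operations, for example) and that may therefore change results. It may also allow less accurate hardware instructions to be used (for example, fsin on x86).

See the article on the accuracy of fsin.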

Live on Godbolt, where a^4 is computed as (a^2)^2:

float f(float a){
    return a*a*a*a;
}

;With -Ofast
f(float):                                  # @f(float)
        mulss   xmm0, xmm0
        mulss   xmm0, xmm0
        ret

;With -O3
f(float): # @f(float)
  movaps xmm1, xmm0
  mulss xmm1, xmm1
  mulss xmm1, xmm0
  mulss xmm1, xmm0
  movaps xmm0, xmm1
  ret
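The two listings above correspond to two different ways of associating the multiplications, and in float they do not always agree. The sketch below spells both out; the function names are hypothetical, and a build without fast-math flags is assumed so that the compiler keeps each order as written.

```cpp
// Three multiplications in source order, as in the longer listing.
float pow4_sequential(float a) {
    return ((a * a) * a) * a;
}

// Two multiplications: square, then square again, as in the shorter listing.
float pow4_squared(float a) {
    const float sq = a * a;
    return sq * sq;
}

// True when the two orders round to different floats for this input.
bool pow4_orders_differ(float a) {
    return pow4_sequential(a) != pow4_squared(a);
}
```

One input worked by hand is a = 1 + 2^-12 (1.000244140625f): squaring it lands exactly on a rounding tie and loses the low bit, so the squared form ends up one unit in the last place below the sequential form. For many other inputs, such as 2.0f, both orders are exact and agree.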

-menable-no-nans

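Assumes that no operation produces a NaN
and that no argument is a NaN.

FP comparisons involving NaN follow special rules: every ordered comparison with a NaN operand is false.
For example, live on Godbolt: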

bool f(float a, float b){
    return a<b;
}

;With -Ofast
f(float, float):                                 # @f(float, float)
        ucomiss xmm0, xmm1
        setb    al
        ret

;With -O3
f(float, float): # @f(float, float)
  ucomiss xmm1, xmm0
  seta al
  ret

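Note that the two versions are not equivalent: the -O3 one returns false when a and b are unordered (that is, when either is NaN), while the -Ofast one returns true in that case.
Both are a single ucomiss followed by a setcc, so here the difference is in semantics rather than in instruction count.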
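In source terms, the setb version behaves like !(a >= b) rather than a < b. The two expressions agree for ordinary numbers and differ only when an operand is NaN, as this sketch shows. The function names are hypothetical, and a build without fast-math flags is assumed (with -menable-no-nans the compiler may treat the two as interchangeable).

```cpp
#include <limits>

// The comparison as written in the listing above.
bool less_than(float a, float b) {
    return a < b;
}

// Equivalent to less_than only when neither operand is NaN.
bool not_greater_equal(float a, float b) {
    return !(a >= b);
}

// Convenience helper returning a quiet NaN.
float quiet_nan() {
    return std::numeric_limits<float>::quiet_NaN();
}
```

Under IEEE rules less_than(quiet_nan(), 1.0f) is false while not_greater_equal(quiet_nan(), 1.0f) is true, which mirrors the difference between the seta and setb listings.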

-menable-no-infs

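Like the option above, but for infinities: the compiler assumes that no argument or result is infinite.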

Godbolt, , .

mb- glibc (, sinc) , -Ofast.


Source: https://habr.com/ru/post/1683719/

