Is it still worth trying to create optimizations for sqrt() in C?

Are the old tricks for creating faster sqrt() implementations (lookup tables, approximation functions) still useful, or are the default implementations already as fast as they are going to get with modern compilers and hardware?

+3
8 answers

Rule 1: profile before optimizing

Before you put any effort into the belief that you can beat the optimizer, you should profile everything and find out where the bottleneck actually lies. In general, it is unlikely that sqrt() itself is your bottleneck.

Rule 2: replace the algorithm before replacing the standard function

sqrt() , - , ( , - ), sqrt() .

,

C CRT , , sqrt(), .

, MinGW gcc v3.4.5 sqrt() , FPU FSQRT. , C IEEE 754, FSQRT , sqrt() , ​​ .

sqrt() double, .

( ) , , .

, , . , (-b + sqrt(b*b - 4.*a*c)) / (2*a) ?

, , , .

.

, , , , C , .

: , , . , .

Edit2: ​​ , kmm .

+16

Sqrt . , , "".

() , , . ( ), - .

, , " ".

+4

, sqrt() , , . .

+3

, . .

, , -, ? ( , , ), - , ?

? , , .

, ? , CPU sqrt, , . , , CPU, , sqrt().

, ?

" sqrt".

" ".

+2

, , :

#include <stdint.h>

/* Rough sqrt approximation; assumes a 32-bit IEEE 754 float
   and a positive, normal input. */
float fastsqrt(float val)  {
        union
        {
                int32_t tmp;
                float val;
        } u;
        u.val = val;
        u.tmp -= 1<<23; /* Subtract 1 from the biased exponent so 1.0 gives 1.0 */
        /* tmp is now an approximation to logbase2(val) */
        u.tmp >>= 1; /* divide by 2 */
        u.tmp += 1<<29; /* add 64 to exponent: (e+127-1)/2 = (e/2)+63, */
        /* and (e/2)+63+64 = (e/2)+127, the biased form of e/2 */
        return u.val;
}


, , . 0,00175228 .

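#include <stdint.h>
#include <string.h>

float InvSqrt (float x)
{
    float xhalf = 0.5f*x;
    int32_t i;
    memcpy(&i, &x, sizeof i);     /* read the float's bits without violating strict aliasing */
    i = 0x5f3759df - (i>>1);      /* magic-constant initial guess for 1/sqrt(x) */
    memcpy(&x, &i, sizeof x);     /* reinterpret the bits as a float again */
    return x*(1.5f - xhalf*x*x);  /* one Newton-Raphson refinement step */
}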

( ) 4 , (float)(1.0/sqrt(x))

+2

? , , !

+1

, sqrt - , . , - , , , (, L1 L2), .

0

- , .

, , , SIMD: rsqrtps. - , . rsqrtps , , ( , , ).
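
For readers who have not met rsqrtps, here is a minimal sketch of reaching it from C through the SSE intrinsic _mm_rsqrt_ps, which approximates four reciprocal square roots at once. The wrapper name rsqrt4 is hypothetical, and the non-SSE fallback is only there so the sketch builds on other targets; it is not a claim about how the answer's author used the instruction.

```c
#include <math.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: approximate 1/sqrt(x) for four floats at once. */
void rsqrt4(const float in[4], float out[4])
{
#if defined(__SSE__)
    __m128 v = _mm_loadu_ps(in);          /* unaligned load of 4 floats */
    _mm_storeu_ps(out, _mm_rsqrt_ps(v));  /* compiles to rsqrtps */
#else
    /* Portable fallback when SSE is not available (assumption: exactness
       is acceptable here, since it is only a stand-in). */
    for (int i = 0; i < 4; ++i)
        out[i] = 1.0f / sqrtf(in[i]);
#endif
}
```

The hardware result is only an approximation (roughly 12 bits of precision), so code that needs more usually follows it with a Newton-Raphson step like the one in InvSqrt above.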

sqrt, , , . , , . , C sqrt, .

(, ), , , , , malloc. malloc , , . , , .

0

Source: https://habr.com/ru/post/1709324/

