rootn (float_t x, int_t n) is a function that computes the nth root x^(1/n), and is supported by some programming languages, such as OpenCL. When IEEE-754 floating-point numbers are used, efficient low-accuracy starting approximations for any n can be generated based on simple manipulation of the underlying bit pattern, assuming only normalized operands x need to be processed.
The binary exponent of root (x, n) will be 1/n of the binary exponent of x. The exponent field of an IEEE-754 floating-point number is biased. Instead of un-biasing the exponent, dividing it, and re-biasing the result, we can simply divide the biased exponent by n, then apply an offset to compensate for the previously neglected bias. Furthermore, instead of extracting and dividing just the exponent field, we can simply divide the entire operand x, reinterpreted as an integer. The required offset is trivial to find by requiring that an argument of 1 deliver a result of 1 for any n.
If we have two helper functions, __int_as_float(), which reinterprets an int32 operand as binary32, and __float_as_int(), which reinterprets a binary32 operand as int32, we arrive at the following low-precision approximation to rootn (x, n) in a straightforward manner:
rootn (x, n) ~= __int_as_float((int)(__float_as_int(1.0f)*(1.0-1.0/n)) + __float_as_int(x)/n)
The integer division __float_as_int (x) / n can be reduced to a shift, or to a multiplication plus a shift, by the well-known optimizations for integer division by a constant divisor. Some examples:
rootn (x, 2) ~= __int_as_float (0x1fc00000 + __float_as_int (x) / 2)  // sqrt (x)
rootn (x, 3) ~= __int_as_float (0x2a555556 + __float_as_int (x) / 3)  // cbrt (x)
rootn (x,-1) ~= __int_as_float (0x7f000000 - __float_as_int (x) / 1)  // rcp (x)
rootn (x,-2) ~= __int_as_float (0x5f400000 - __float_as_int (x) / 2)  // rsqrt (x)
rootn (x,-3) ~= __int_as_float (0x54aaaaaa - __float_as_int (x) / 3)  // rcbrt (x)
With all of these approximations, the result will be exact only when x = 2^(n*m) for integer m. Otherwise, the approximation will overestimate the true mathematical result. We can roughly halve the maximum relative error by reducing the offset slightly, leading to a balanced mix of underestimation and overestimation. This is easily accomplished by a binary search for the optimal offset that uses all floating-point numbers in the interval [1, 2^n) as test cases. Doing so, we find:
rootn (x, 2) ~= __int_as_float (0x1fbb4f2e + __float_as_int (x) / 2)  // max rel err = 3.47474e-2
rootn (x, 3) ~= __int_as_float (0x2a51067f + __float_as_int (x) / 3)  // max rel err = 3.15547e-2
rootn (x,-1) ~= __int_as_float (0x7ef311c2 - __float_as_int (x) / 1)  // max rel err = 5.05103e-2
rootn (x,-2) ~= __int_as_float (0x5f37642f - __float_as_int (x) / 2)  // max rel err = 3.42128e-2
rootn (x,-3) ~= __int_as_float (0x54a232a3 - __float_as_int (x) / 3)  // max rel err = 3.42405e-2
Some may notice that the computation for rootn (x,-2) is basically the initial portion of Quake's fast inverse square root.
Based on the differences between the original raw offsets and the final offsets optimized to minimize the maximum relative error, one could formulate a heuristic for that secondary correction, and thus for the final, optimized offset value.
However, I am wondering whether it is possible to determine the optimal offset by a closed-form formula, such that the maximum absolute value of the relative error, max (|(approx (x, n) - x^(1/n)) / x^(1/n)|), is minimized for all x in [1, 2^n). For ease of exposition, we can restrict ourselves to binary32 (IEEE-754 single precision) numbers.
I am aware that, in general, there are no closed-form solutions for minimax approximations; however, I am under the impression that closed-form solutions do exist for the case of polynomial approximations to algebraic functions, such as the nth root. In this case, we have a (piecewise) linear approximation.