Which is more efficient, atan2 or sqrt?

There are often several ways to compute the same value.

I am currently working on an algorithm for "expanding" a two-dimensional convex polygon. To do this, I want to find the direction in which to push each vertex outward. To get a result that expands the polygon with a "skin" of uniform thickness all around, the amount each vertex is displaced in that direction also depends on the angle at the vertex. But right now I'm only concerned with the direction.

One way uses atan2: let B be my vertex, A the previous vertex, and C the next vertex. My direction is the "angular mean" of angle(BA) and angle(BC).
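A minimal sketch of the atan2 approach, assuming points are (x, y) tuples; the function name `bisector_atan2` is hypothetical. It shows why averaging angles needs some care: the difference between the two angles has to be wrapped into (-pi, pi] before halving, otherwise the mean can land on the wrong side of the vertex.

```python
import math

def bisector_atan2(a, b, c):
    """Unit vector at vertex b bisecting the angle between rays BA and BC."""
    ang_ba = math.atan2(a[1] - b[1], a[0] - b[0])
    ang_bc = math.atan2(c[1] - b[1], c[0] - b[0])

    # Both angles lie in [-pi, pi], so their difference lies in [-2pi, 2pi].
    # Wrap it into (-pi, pi] so the mean is taken across the shorter arc.
    diff = ang_bc - ang_ba
    if diff > math.pi:
        diff -= 2.0 * math.pi
    elif diff <= -math.pi:
        diff += 2.0 * math.pi

    mid = ang_ba + 0.5 * diff
    return (math.cos(mid), math.sin(mid))
```

The result points between the two rays, which for a convex polygon is into the interior; negate it to get the outward direction. Note that turning the angle back into a vector costs a cos and a sin on top of the two atan2 calls.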

Another way uses sqrt: unit(BA) + unit(BC), where unit(X) is X/length(X), gives a vector with my direction.
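A minimal sketch of the sqrt approach, again assuming (x, y) tuples and a hypothetical function name `bisector_sqrt`. It needs two square roots and no angle bookkeeping.

```python
import math

def bisector_sqrt(a, b, c):
    """Vector at vertex b along the bisector of rays BA and BC (not normalized)."""
    ux, uy = a[0] - b[0], a[1] - b[1]   # BA
    vx, vy = c[0] - b[0], c[1] - b[1]   # BC

    # One square root per edge; multiplying by the reciprocal normalizes it.
    inv_u = 1.0 / math.sqrt(ux * ux + uy * uy)
    inv_v = 1.0 / math.sqrt(vx * vx + vy * vy)

    # unit(BA) + unit(BC)
    return (ux * inv_u + vx * inv_v, uy * inv_u + vy * inv_v)
```

The sum has the bisector's direction but not unit length, and it collapses to the zero vector when A, B, and C are collinear with B in the middle, so that case would need separate handling.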

I lean toward the second method, because averaging angle values takes a bit of care. But basically I am choosing between two atan2 calls and two sqrt calls. Which is faster? And what if I were doing this in a shader program?

I am not trying to optimize my program per se; I would like to know how these functions are typically implemented (for example, in standard C libraries), so that I can learn, in general, which is the better choice.

From what I know, both sqrt and the trig functions require an iterative method to arrive at an answer. This is why we try to avoid them whenever possible. People have come up with "approximate" functions that use lookup tables, interpolation, and the like to get faster results. Of course, I would never bother with this unless I found hard evidence of a bottleneck in my code caused by these routines or by routines that rely heavily on them, but the differences between sqrt, trig functions, and inverse trig functions may be worth discussing.

+4
3 answers

With typical libraries on common modern hardware, sqrt is faster than atan2. Cases where atan2 is faster may exist, but they are few and far between.

Recent x86 implementations have a fairly efficient sqrt instruction, and on that hardware the difference can be quite dramatic. The Intel Optimization Guide quotes single-precision square root as 14 cycles on Sandy Bridge, and double-precision square root as 22 cycles. With a good math library, atan2 timings are usually in the neighborhood of 100 cycles or more.

+19

It sounds like you have all the information you need to profile and find out for yourself.

If you are not looking for an exact result and don't mind the additional logic needed to make it work, you can use specialized instructions such as RSQRTSS and RSQRTPS, which compute an approximate 1/sqrt, to combine the two expensive operations (the square root and the division).
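The "additional logic" is typically a Newton-Raphson refinement of the rough estimate. The sketch below illustrates the idea only: Python cannot issue RSQRTSS, so `rsqrt_estimate` is a hypothetical stand-in that uses the well-known float32 bit trick instead of the hardware instruction, and both function names are assumptions.

```python
import struct

def rsqrt_estimate(x):
    """Rough 1/sqrt(x) for positive x (stand-in for a hardware estimate)."""
    # Reinterpret the float32 bits as an integer, apply the classic
    # magic-constant trick, and reinterpret the result as a float again.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    i = 0x5F3759DF - (i >> 1)
    return struct.unpack('<f', struct.pack('<I', i))[0]

def rsqrt_refined(x, steps=1):
    """Refine the estimate with Newton-Raphson steps: y <- y * (1.5 - 0.5*x*y*y)."""
    y = rsqrt_estimate(x)
    for _ in range(steps):
        y = y * (1.5 - 0.5 * x * y * y)
    return y
```

Normalizing a vector then becomes a multiplication by `rsqrt_refined(x*x + y*y)`, with no separate square root or division. Each refinement step roughly squares the relative error, so how many steps are worth taking depends on the accuracy required.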

+1

Indeed, sqrt is better than atan2, and 1/sqrt is better than sqrt.

For a solution that does not rely on built-in functions, you might be interested in CORDIC approximations.
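As a rough illustration of the CORDIC idea, here is a floating-point sketch of the vectoring mode, which rotates the input vector toward the positive x axis in steps of atan(2^-i) and accumulates the total rotation. The name `cordic_atan2` and the default iteration count are assumptions; a real implementation would normally use fixed-point arithmetic, where each multiplication by 2^-i is a bit shift.

```python
import math

# Table of atan(2^-i); in a fixed-point implementation this is a small constant table.
_ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(32)]

def cordic_atan2(y, x, iterations=24):
    """Approximate atan2(y, x) with CORDIC in vectoring mode (iterations <= 32)."""
    if x == 0.0 and y == 0.0:
        return 0.0

    # Pre-rotate by 90 degrees into the right half-plane, where the
    # iteration converges, and remember the rotation in `angle`.
    angle = 0.0
    if x < 0.0:
        if y >= 0.0:
            x, y, angle = y, -x, math.pi / 2
        else:
            x, y, angle = -y, x, -math.pi / 2

    for i in range(iterations):
        k = 2.0 ** -i
        if y > 0.0:
            # Rotate clockwise by atan(2^-i), driving y toward zero.
            x, y = x + y * k, y - x * k
            angle += _ATAN_TABLE[i]
        else:
            # Rotate counter-clockwise by atan(2^-i).
            x, y = x - y * k, y + x * k
            angle -= _ATAN_TABLE[i]
    return angle
```

Each iteration adds roughly one bit of accuracy, so the cost scales with the precision wanted rather than being fixed.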

But in your case, you should work out the complete formulas and optimize them as a whole before drawing any conclusions, because the transcendental function is only one part of the computation.
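As one example of what working out the complete formula can give, here is a sketch that computes the displaced vertex directly from the two edge normals, assuming a counter-clockwise convex polygon and a skin thickness `d`. The function name `offset_vertex` is hypothetical. If n1 and n2 are the outward unit normals of edges AB and BC, the displacement m = k(n1 + n2) must satisfy m·n1 = d, which gives k = d / (1 + n1·n2). No angle is ever computed, and the only expensive operations are two square roots.

```python
import math

def offset_vertex(a, b, c, d):
    """Move vertex b outward so that edges AB and BC both shift by distance d.

    Assumes a, b, c are consecutive vertices of a counter-clockwise convex polygon.
    """
    # Edge directions A->B and B->C
    e1x, e1y = b[0] - a[0], b[1] - a[1]
    e2x, e2y = c[0] - b[0], c[1] - b[1]
    inv1 = 1.0 / math.sqrt(e1x * e1x + e1y * e1y)
    inv2 = 1.0 / math.sqrt(e2x * e2x + e2y * e2y)

    # Outward unit normals for counter-clockwise winding: (dy, -dx)
    n1x, n1y = e1y * inv1, -e1x * inv1
    n2x, n2y = e2y * inv2, -e2x * inv2

    # Scale so the displacement projects to exactly d on both normals.
    k = d / (1.0 + n1x * n2x + n1y * n2y)
    return (b[0] + k * (n1x + n2x), b[1] + k * (n1y + n2y))
```

For the unit square with vertices (0,0), (1,0), (1,1), (0,1), `offset_vertex((0, 0), (1, 0), (1, 1), 0.5)` moves the corner (1, 0) to (1.5, -0.5).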

0

Source: https://habr.com/ru/post/1396902/

