Efficient algorithm for computing the integer square root (isqrt) of arbitrarily large integers

Note

For a solution in Erlang or C / C++ continue to the next section 4.


Wikipedia Articles

Integer square root

  • The definition of integer square root can be found here.

Square Root Calculation Methods

  • You can find the bit magic algorithm here.

[Test 1: Using the library function]

code

 isqrt(N) when erlang:is_integer(N), N >= 0 -> erlang:trunc(math:sqrt(N)). 

Problem

This implementation uses the sqrt() function from the C library, so it does not work with arbitrarily large integers (note that the returned result does not match the input. The correct answer should be 12345678901234567890 ):

 Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] Eshell V5.10.4 (abort with ^G) 1> erlang:trunc(math:sqrt(12345678901234567890 * 12345678901234567890)). 12345678901234567168 2> 

[Test 2: Only use Bigint + ]

code

 isqrt2(N) when erlang:is_integer(N), N >= 0 -> isqrt2(N, 0, 3, 0). isqrt2(N, I, _, Result) when I >= N -> Result; isqrt2(N, I, Times, Result) -> isqrt2(N, I + Times, Times + 2, Result + 1). 

Description

This implementation is based on the following observation:

 isqrt(0) = 0 # <--- One 0 isqrt(1) = 1 # <-+ isqrt(2) = 1 # |- Three 1's isqrt(3) = 1 # <-+ isqrt(4) = 2 # <-+ isqrt(5) = 2 # | isqrt(6) = 2 # |- Five 2's isqrt(7) = 2 # | isqrt(8) = 2 # <-+ isqrt(9) = 3 # <-+ isqrt(10) = 3 # | isqrt(11) = 3 # | isqrt(12) = 3 # |- Seven 3's isqrt(13) = 3 # | isqrt(14) = 3 # | isqrt(15) = 3 # <-+ isqrt(16) = 4 # <--- Nine 4's ... 

Problem

This implementation includes adding only bigint, so I expected it to work quickly. However, when I fed it 1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111 , it seems to have been running forever on my (very fast) machine.


[Trial 3: Using binary search with Bigint +1 , -1 and div 2 Only]

code

Option 1 (My initial implementation)

 isqrt3(N) when erlang:is_integer(N), N >= 0 -> isqrt3(N, 1, N). isqrt3(_N, Low, High) when High =:= Low + 1 -> Low; isqrt3(N, Low, High) -> Mid = (Low + High) div 2, MidSqr = Mid * Mid, if %% This also catches N = 0 or 1 MidSqr =:= N -> Mid; MidSqr < N -> isqrt3(N, Mid, High); MidSqr > N -> isqrt3(N, Low, Mid) end. 

Option 2 (the code modified above, so that the borders go using Mid + 1 or Mid-1, referring to the answer from Vikram Bhat )

 isqrt3a(N) when erlang:is_integer(N), N >= 0 -> isqrt3a(N, 1, N). isqrt3a(N, Low, High) when Low >= High -> HighSqr = High * High, if HighSqr > N -> High - 1; HighSqr =< N -> High end; isqrt3a(N, Low, High) -> Mid = (Low + High) div 2, MidSqr = Mid * Mid, if %% This also catches N = 0 or 1 MidSqr =:= N -> Mid; MidSqr < N -> isqrt3a(N, Mid + 1, High); MidSqr > N -> isqrt3a(N, Low, Mid - 1) end. 

Problem

Now it resolves to a 79-digit number (namely 1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111 ) in the speed of lighting, the result is displayed immediately. However, it takes 60 seconds (+ - 2 seconds) for my machine to resolve one million (1,000,000) 61-digit numbers (namely, between 1000000000000000000000000000000000000000000000000000001000000 ). I would like to do it even faster.


[Trial 4: Using the Newton method only with Bigint + and div ]

code

 isqrt4(0) -> 0; isqrt4(N) when erlang:is_integer(N), N >= 0 -> isqrt4(N, N). isqrt4(N, Xk) -> Xk1 = (Xk + N div Xk) div 2, if Xk1 >= Xk -> Xk; Xk1 < Xk -> isqrt4(N, Xk1) end. 

Code in C / C ++ (for your interest)

Recursive option

 #include <stdint.h> uint32_t isqrt_impl( uint64_t const n, uint64_t const xk) { uint64_t const xk1 = (xk + n / xk) / 2; return (xk1 >= xk) ? xk : isqrt_impl(n, xk1); } uint32_t isqrt(uint64_t const n) { if (n == 0) return 0; if (n == 18446744073709551615ULL) return 4294967295U; return isqrt_impl(n, n); } 

Iterative option

 #include <stdint.h> uint32_t isqrt_iterative(uint64_t const n) { uint64_t xk = n; if (n == 0) return 0; if (n == 18446744073709551615ULL) return 4294967295U; do { uint64_t const xk1 = (xk + n / xk) / 2; if (xk1 >= xk) { return xk; } else { xk = xk1; } } while (1); } 

Problem

Erlang code resolves one million (1,000,000) 61-digit numbers in 40 seconds (+ - 1 second) on my machine, so it's faster than trial 3. Can it go even faster?


About my computer

Processor: 3.4 GHz Intel Core i7

Memory: 32 GB 1600 MHz DDR3

OS: Mac OS X Version 10.9.1


Matters Related

Integer square root in python

  • Answer user448810 uses the "Newton Method" . I am not sure if separation using "integer division" works or not. I will try this later as an update. [UPDATE (2015-01-11): This is normal)

  • The answer in mathematics includes the use of a third-party Python package gmpy , which is not very profitable for me, since I am primarily interested in solving it in Erlang only with built-in tools.

  • DSM's answer seems interesting. I really do not understand what is happening, but it seems that bit magic is involved, and therefore it is not suitable for me either.

Infinite recursion squared squares with square mark

  • This question is for C ++, and the AraK algorithm (questionnaire) looks the same as the previous version of Trial 2.
+5
source share
1 answer

How about a binary search like the following, no floating divisions of only the whole multiplications are required (slower than the Newton method): -

 low = 1; /* More efficient bound high = pow(10,log10(target)/2+1); */ high = target while(low<high) { mid = (low+high)/2; currsq = mid*mid; if(currsq==target) { return(mid); } if(currsq<target) { if((mid+1)*(mid+1)>target) { return(mid); } low = mid+1; } else { high = mid-1; } } 

This works for O(logN) iterations, so it should not be done forever even for very large numbers

Log10 (target) Calculation if necessary: ​​-

 acc = target log10 = 0; while(acc>0) { log10 = log10 + 1; acc = acc/10; } 

Note: acc/10 is a whole division

Edit: -

Effective Border:. sqrt (n) has about half the number of digits as n, so you can pass high = 10^(log10(N)/2+1) && low = 10^(log10(N)/2-1) to get a tighter binding, and it should provide acceleration 2 times.

Rating: -

 bound = 1; acc = N; count = 0; while(acc>0) { acc = acc/10; if(count%2==0) { bound = bound*10; } count++; } high = bound*10; low = bound/10; isqrt(N,low,high); 
+2
source

Source: https://habr.com/ru/post/969829/


All Articles