Is there a name for this type of binary search?

When writing code today, I came across a circumstance that made me write binary searches of a kind that I had never seen before. Does this binary search have a name, and is it a binary search?

Motivation

First of all, to simplify the search, I will explain the example of use that gave rise to its creation.

Say you have a list of ordered numbers. You will be asked to find the index number in the list closest to x.

int findIndexClosestTo(int x); 

The calls to findIndexClosestTo() always follow this rule:

If the last result of findIndexClosestTo() was i , then indexes closer to i are more likely to be the result of the current call to findIndexClosestTo() .

In other words, the index that we need to find this time is more likely to be closer to the last one found, the farther from it.

For example, imagine a simulated boy walking on the left and right of the screen. If we often refer to the location index of the boy, most likely he is somewhere near the last place where we found him.

Algorithm

In the above case, we know that the last result of findIndexClosestTo() was i (if this is really the first time the function has been called, i is used by default for the average index of the list, for simplicity, although a separate binary search to find the result of the first call, will be actually faster), and the function was called again. Given the new number x , we follow this algorithm to find its index:

  • interval = 1;
  • Is this the number we are looking for, x located at i ? If yes, return i ;
  • If not, determine if x above or below i . (Remember that the list is sorted.)
  • Move interval indices in the x direction.
  • If we found x in our new location, return this location.
  • Double interval . (i.e. interval *= 2 )
  • If we passed x , go back to interval indexes, set interval = 1 , go to 4.

Given the probability rule indicated above (under the heading Motivation), this seems to me to be the most effective way to find the right index. Do you know a faster way?

+6
source share
4 answers

What you do is the (IMHO) version of Interpolation Search

In the interpolation search, you assume that the numbers are equally distributed, and then you try to guess the location of the number from the first and last number and the length of the array.

In your case, you change the interpolation algorithm in such a way that you assume that the Key is very close to the last number you were looking for.

Also note that your algorithm is similar to algo, where TCP tries to find the optimal packet size. (I do not remember the name :()

  • Start slow
    • Double spacing
    • if the package does not reboot from the last successful package. / Reboot from the default package size .. 3.
+3
source

In the worst case, your algorithm is O ((log n) ^ 2).

Suppose you start at 0 (with an interval = 1), and the value you are looking for is in the position 2 ^ n - 1.

First you check 1, 2, 4, 8, ..., 2 ^ (n-1), 2 ^ n. Oops, this is overshoot, so go back to 2 ^ (n-1).

Next, you check 2 ^ (n-1) +1, 2 ^ (n-1) +2, ..., 2 ^ (n-1) + 2 ^ (n-2), 2 ^ (n -1) + 2 ^ (n-1). This last term is 2 ^ n, so it screams, which again overshadows. Back to 2 ^ (n-1) + 2 ^ (n-2).

And so on, until you finally reach 2 ^ (n-1) + 2 ^ (n-2) + ... + 1 == 2 ^ n - 1.

The first exceeded log n steps. The next step is (log n) -1. The following took (log n) - 2 steps. Etc.

So, in the worst case, you followed steps 1 + 2 + 3 + ... + log n == O ((log n) ^ 2).

The best idea, I think, is to switch to traditional binary search as soon as you redo the first time. This will preserve the performance of the O (log n) algorithm in the worst case, although it will be slightly faster when the target is really nearby.

I do not know the name for this algorithm, but I like it. (By a strange coincidence, I could use it yesterday. Indeed.)

+4
source

Your procedure is typical of interpolation routines. You don’t lose much if you call it random numbers (~ standard binary search), but if you call it with slowly increasing numbers, you don’t need much time to find the correct index.

Thus, this is a reasonable default behavior for finding an ordered table for interpolation purposes.

This method is discussed with great length in Numerical Recipes 3rd edition, section 3.1.

+1
source

This speaks of my head, so I have nothing to reinforce, but the gut feeling!

In step 7, if we go through x, it may be faster to halve the interval and go back to x - effectively, interval = -(interval / 2) , rather than reset interval to 1.

I will have to sketch a few numbers on paper, though ...

Edit: Apologies - I'm saying nonsense above: ignore me! (And I'll leave and think about it this time ...)

0
source

Source: https://habr.com/ru/post/895982/