Difference between basic binary search of upper bound and lower bound?

In the article http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binarySearch, the author discusses binary search. He makes a distinction between finding the smallest value when something is true and the highest value when something is false. When searching for an array, it looks something like this:

false false false true true

I am curious why these two cases are different. Why can't you find the lowest value that is true, and then subtract it to find the highest value that is false?

Edit2: Okay, so I understand the lower and upper bounds. Now I'm trying to understand that when looking for the smallest integer greater than or equal to the query, why can't we just change if(mid>query) to if(mid>=query) and make it lower than the upper bound.

Edit: This is what the article says:

"Now we’ll finally move on to code that implements binary search, as described in this and the previous section:

 binary_search(lo, hi, p): while lo < hi: mid = lo + (hi-lo)/2 if p(mid) == true: hi = mid else: lo = mid+1 if p(lo) == false: complain // p(x) is false for all x in S! return lo // lo is the least x for which p(x) is true 

...

If we wanted to find the last x for which p (x) is false, we would develop (using a similar reasoning as described above) something like:

 binary_search(lo, hi, p): while lo < hi: mid = lo + (hi-lo+1)/2 // note: division truncates if p(mid) == true: hi = mid-1 else: lo = mid if p(lo) == true: complain // p(x) is true for all x in S! return lo // lo is the greatest x for which p(x) is false 

"

+6
source share
3 answers

The lower and upper bounds of the binary search are the lowest and highest positions where the value can be inserted without disturbing the order. (In the C ++ standard library, these boundaries will be represented by iterators referring to an element in front of which a value can be inserted, but the concept has not been substantially changed.)

Take, for example, a sorted range

 1 2 3 4 5 5 5 6 7 9 

In binary search 3 we will have

  v-- lower bound 1 2 3 4 5 5 5 6 7 9 ^-- upper bound 

And in binary search 5 :

  v-- lower bound 1 2 3 4 5 5 5 6 7 9 ^-- upper bound 

The lower and upper borders are the same if the element does not exist in the range. In binary search 8 :

  v-- lower bound 1 2 3 4 5 5 5 6 7 9 ^-- upper bound 

The author of the article you are referring to says all this in equivalent terms “less” and “more”, so that when searching for 5,

  v-- lower bound ttttffffff <-- smaller than? 1 2 3 4 5 5 5 6 7 9 fffffffttt <-- greater than? ^-- upper bound 

In all these cases, C ++ iterators refer to an element immediately abroad. I.e:

  • In search 3 iterator returned by std::lower_bound will refer to 3 , and one of std::upper_bound will refer to 4
  • In search 5 iterator returned by std::lower_bound will refer to the first 5 , and one of std::upper_bound will refer to 6
  • In search 8 both will refer to 9

This is because the convention in the C ++ standard library for insertion is to pass an iterator that references the element before which the new element should be inserted. For example, after

 std::vector<int> vec { 1, 3, 4, 5, 5, 5, 6, 7, 9 }; vec.insert(vec.begin() + 1, 2); 

vec will contain 1, 2, 3, 4, 5, 5, 5, 6, 7, 9 . std::lower_bound and std::upper_bound follow this convention to

 vec.insert(std::lower_bound(vec.begin(), vec.end(), 5), 5); vec.insert(std::upper_bound(vec.begin(), vec.end(), 8), 8); 

work as desired and leave vec sorted.

In general, this is an expression of how ranges are defined in the C ++ standard library. The start iterator of the range refers to the first element of the range (if any), while the end iterator refers to the element (if any) immediately after the end of the range. Another way to look at this is that the iterators returned by std::lower_bound and std::upper_bound cover the range of elements in the search range that are equivalent to the element to be searched.

This range is empty if the element is not in the range, therefore lower_bound and upper_bound return the same iterator, otherwise lower_bound returns an iterator, referring to the first element in the search range, which is equivalent to the search value, and upper_bound returns an iterator referring to an element (if any) that immediately follows the last such element.

+24
source

If the array will always be

 falsetrue 

Then the index before the one you find will always be false unless you find true at index 0 . Another borderline case, as mentioned in my comment above, is if you did not find true . Then the highest false value will be the last part of the array.

+1
source

Both algorithms obviously differ in the condition of what should happen if there is no true or no false value, as is really obvious from the code fragment: if you find the smallest value, where the value is true and subtract 1 from this position to find the highest value, giving false , the wrong result is obtained, since there is no such object. Since the algorithms simply target different elements related to the localization of the corresponding element directly, and not with a special case, they also avoid dealing with a special case, reducing the amount of code. Since the special case code is usually executed only once for each invocation of the algorithm, it is likely to be slightly worse than avoiding the special case. This is something that can be appreciated.

Please note that the sample code is not C ++, even though the question is marked as C ++. As a result, this is not idiomatic C ++. A typical C ++ approach for implementing something like lower_bound() or upper_bound() is to use appropriate iterators. These algorithms will not "complain" if there is no suitable element, since they simply create an iterator of the corresponding position, i.e. The iterator to start for std::lower_bound() and the iterator of the past end for std::upper_bound() .

0
source

Source: https://habr.com/ru/post/982155/


All Articles