The fastest way to determine values ​​corresponding to an interval in an array

I have a sorted array intgoing from xto y(the values ​​of the elements are random, but sorted in ascending order using qsort()). The program receives various intervals, such as <10;50>or <50;100>. I have the following simple loop forto determine if values ​​in an array are in a given interval, if so, add it to the counter.

 for(int i = 0; i < arraySize ;i++ )  {        
       if (points[i] >= interval1 && points[i] <= interval2){
            counter++;               
        }
    }

I need a faster way than O(n)to search through an array and determine if a value is points[i]in a given interval or not. Values ​​can be in millions, so they slow down drastically.

Array elements can range from 0 to 1,000,000,000 (1e9). Intervals respectively.

+4
source share
4 answers

Use binary search - for the input interval [i, j], find the index of the smallest integer that is greater than i, find the index of the largest integer that is less than j, and then return the distance between them.

ssize_t bin_search_first_larger(int arr[], size_t arr_sz, int val) {
    ssize_t l = -1;
    ssize_t r = arr_sz;
    /* invariant: arr[l] < val && val <= arr[r] */
    while (l+1 != r) {
        ssize_t m = l+(r-l)/2;
        if (arr[m] < val) {
            l = m;
        } else {
            r = m;
        }
    }
    /* l+1 == r && arr[l] < val && val <= arr[r] */
    return r;
}

ssize_t bin_search_last_smaller(int arr[], size_t arr_sz, int val) {
    ssize_t l = -1;
    ssize_t r = arr_sz;
    /* invariant: arr[l] <= val && val < arr[r] */
    while (l+1 != r) {
        ssize_t m = l+(r-l)/2;
        if (arr[m] <= val) {
            l = m;
        } else {
            r = m;
        }
    }
    /* l+1 == r && arr[l] <= val && val < arr[r] */
    return l;
}

ssize_t values_in(int arr[], size_t arr_sz, int x, int y) {
    ssize_t i = bin_search_first_larger(arr, arr_sz, x);
    ssize_t j = bin_search_last_smaller(arr, arr_sz, y);
    return j-i+1;
}

The binary search code is adapted from Jon Bentley Programming Pearls (which is well worth reading), which shows how a binary search can be modified to return either the first occurrence or the last occurrence of a value in a sorted array with duplicates (instead of returning an arbitrary occurrence of a duplicate values). The process is similar to your use case, the difference is subtle.

, , arr[-1] - , arr[N] - ( N - ), , , .

O(log(N)), N - , (?) - .

, , , ( , y , , x , , x y , ), , , , , .

+2

, , O(log n), O(1) (time), [a,b].

, , O(MAXVALUE+NVALUES). MAXVALUE , , NVALUES - .

  • 0 -
  • 1,000,000,000 -

O(1) , MAXVALUE+1 int. 1bn, 1GB x sizeof(int) (gcc Linux x86_64, 4 ). 64 .

.

  • ( ): m[0, 1bn] i

  • [a, b] m[a] - m[b+1]
    ( b+1 > MAXVALUE, 0)

:

#define MAXVALUE 1000000000
#define NVALUES     1000000

int *m; // big array

void initialization(int *values)
{
   m = malloc((MAXVALUE+1) * sizeof(*m)); // check if NULL!

   int i,j;

   for(j=0,i=0 ; i<=MAXVALUE ; ) {
      if (j >= NVALUES) m[i++] = 0;
      else if (values[j] >= i) m[i++] = NVALUES-j;
      else j++;
   }
}

[a, b] a<=b:

int count_in_range(int a, int b) {
   int ma = m[a];
   int mb = b >= MAXVALUE ? 0 : m[b+1];
   return ma-mb;
}

m .

+1

:

// position of first element greater than interval2
auto lb = std::upper_bound(array.begin(), array.end(), interval2);
// position of first element greater or equal than interval1
auto ub = std::lower_bound(array.begin(), array.end(), interval1);
// their difference is the number of elements in the needed range
return (ub - lb);

O(log N), / O(log N).

: , C. C / . , impement lower_bound, upper_bound lower_bound(interval2 + 1).

0

BinSearch O (logN).

    int BinSearch(int *array, int first, int last, int value){

          int m;
          /* Optional Error control */
          if (!array || first<0 || last<first)  return -1;

          while (first <= last){

                  m = (first + last)/2;

                  if(array[m] == value) return m;

                  if(value < array[m]) last = m-1;

                  else
                       first = m+1;
            }
           /* Failure search */
           return -1;
     }

-1, , .

You can make an option that returns 1, if you find a value or 0, then you can do

      counter += BinSearch_variant(array,first,last,value);
-1
source

Source: https://habr.com/ru/post/1619706/


All Articles