Binary search as a concept for creating subset data in R

Question

Below are a dataset w and a search key x for two cases.

Case 1:
x = 4
w = c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)

Case 2:
x = 12
w = c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)

I want to create a function that searches for x in the dataset w and reduces the original dataset to a smaller one according to the location of x in w. The result should be a smaller dataset whose upper bound is the search key. The following is the function I'm trying to write in R:

create_chunk <- function(val, tab, L=1L, H=length(tab))
{
  if(H >= L)
  {
    mid = L + ((H-L)/2)
    ## If the element is present within middle length
    if(tab[mid] > val)
    {
      ## subset the original data in reduced size and again do mid position value checking
      ## then subset the data
    } else
    {
      mid = mid + (mid/2)
      ## Increase the mid position to go for right side checking
    }
  }
}

The output I'm looking for is below:

Output for Case 1:
Dataset containing: 1,2,4,4,4,4

Output for Case 2:
Dataset containing: 1,2,4,4,4,4,6,7,8,9,10,11,12


    Please note:
    1. The dataset may contain duplicate values of the search key, and 
       all the duplicates are expected in the output dataset.
    2. I have huge datasets (around 2M rows) from 
       which I am trying to subset a smaller dataset based on the search key.

New update: case 3

Input data:

                 date    value size     stockName
1 2016-08-12 12:44:43 10093.40    4 HWA IS Equity
2 2016-08-12 12:44:38 10093.35    2 HWA IS Equity
3 2016-08-12 12:44:47 10088.00    2 HWA IS Equity
4 2016-08-12 12:44:52 10089.95    1 HWA IS Equity
5 2016-08-12 12:44:53 10089.95    1 HWA IS Equity
6 2016-08-12 12:44:54 10088.95    1 HWA IS Equity

Search key: 10089.95 in the value column.

Expected Result:

                 date    value size     stockName
1 2016-08-12 12:44:47 10088.00    2 HWA IS Equity
2 2016-08-12 12:44:54 10088.95    1 HWA IS Equity
3 2016-08-12 12:44:52 10089.95    1 HWA IS Equity
4 2016-08-12 12:44:53 10089.95    1 HWA IS Equity


r binary-search dataset

Zico Sep 22 '16 at 7:12


This is not a binary search, but if you only need the subset given x and w, the following works :-)

x <- 12
w <- c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)

index <- which(x == w)

w_new <- w[1:index[length(index)]]
print(w_new)
#[1]  1  2  4  4  4  4  6  7  8  9 10 11 12
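The snippet above relies on x occurring in w: when it does not, which(x == w) is empty and the indexing step has nothing to work with. A minimal sketch of one way around that, assuming w is sorted and that for an absent key you want every value below it (an assumption, the question does not say), is a plain comparison. The key value 5 is made up for illustration, and like which() this is a linear scan, not a binary search.

```r
x <- 5  # hypothetical key that does not occur in w
w <- c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)

# w is sorted, so the elements <= x form a prefix of w
w_new <- w[w <= x]
print(w_new)
#[1] 1 2 4 4 4 4
```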


J_F Sep 22 '16 at 7:18

989 · Accepted Answer · 2016-09-22T11:22:24+0000

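Here is a recursive binary search that returns the index of the last occurrence of value. Note that the vector A must be sorted.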

binSearch <- function(A, value, left=1, right=length(A)){
  # A must be sorted in non-decreasing order
  if (left > right)
    return(-1)  # value not found
  middle <- (left + right) %/% 2
  if (A[middle] == value){
    # step right to the last duplicate without reading past the end of A
    while (middle < length(A) && A[middle + 1] == value)
        middle<-middle+1
    return(middle)
    }
  else {
    if (A[middle] > value)
        return(binSearch(A, value, left, middle - 1))
    else
        return(binSearch(A, value, middle + 1, right))
    }
}

w[1:binSearch(w,x1)]
# [1] 1 2 4 4 4 4
w[1:binSearch(w,x2)]
# [1]  1  2  4  4  4  4  6  7  8  9 10 11 12
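binSearch walks over duplicates one element at a time, so a long run of equal keys makes that step linear. A possible variant, sketched here under the assumption that A is sorted, keeps halving the range until it lands on the last element that is <= value. The name upperBound is hypothetical and not part of the original answer; it returns 0 when no element qualifies, which seq_len() turns into an empty subset.

```r
# Hypothetical helper (not from the original answer): index of the last
# element of the sorted vector A that is <= value, or 0 if there is none.
upperBound <- function(A, value) {
  left <- 1L
  right <- length(A)
  last <- 0L
  while (left <= right) {
    middle <- (left + right) %/% 2L
    if (A[middle] <= value) {
      last <- middle          # candidate; keep looking to the right
      left <- middle + 1L
    } else {
      right <- middle - 1L    # too large; look to the left
    }
  }
  last
}

w <- c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)
w[seq_len(upperBound(w, 4))]
# [1] 1 2 4 4 4 4
w[seq_len(upperBound(w, 12))]
# [1]  1  2  4  4  4  4  6  7  8  9 10 11 12
```

Unlike binSearch, this also gives a usable result when the key itself is missing from A.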

Alternatively, you can use the built-in findInterval:

w[1:findInterval(x1,w)]

Binary search takes log(n) time and, according to ?findInterval, so does a findInterval lookup:

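The function findInterval finds the index of one vector x in another, vec, where the latter must be non-decreasing. Where this is trivial, equivalent to apply( outer(x, vec, ">="), 1, sum), as a matter of fact, the internal algorithm uses interval search ensuring O(n * log(N)) complexity where n <- length(x) (and N <- length(vec)). For (almost) sorted x, it will be even faster, basically O(n).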
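To make the equivalence in the documentation concrete, here is a small check that compares findInterval with the outer() formulation on the question's vector. The keys vector is made up for illustration and includes values below, inside, and above the range of w.

```r
w <- c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)
keys <- c(0, 4, 5, 12, 20)  # illustrative keys, not from the question

# For each key: how many elements of w are <= key
fi <- findInterval(keys, w)

# Same count computed the naive way, with one comparison per (key, element) pair
naive <- apply(outer(keys, w, ">="), 1, sum)

print(fi)
# [1]  0  6  6 13 15
print(naive)
# [1]  0  6  6 13 15
```

A result of 0 means the key is smaller than every element of w, so 1:findInterval(x, w) would need guarding in that case, for example with seq_len().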

For the data frame case (assuming it is called df):

o <- order(df$value)
rows <- o[1:findInterval(key, df$value[o])]
df[rows,]

Or, equivalently, with binSearch:

o <- order(df$value)
rows <- o[1:binSearch(df$value[o], key)]
df[rows,]

Data

x1 <- 4
x2 <- 12
w <- c(1,2,4,4,4,4,6,7,8,9,10,11,12,14,15)
key <- 10089.95
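The data frame df used in the snippets above is not defined in the answer. A minimal sketch that rebuilds it from the Case 3 table in the question and runs the findInterval version end to end could look like this; the column types and the UTC time zone are assumptions.

```r
# Reconstructed from the Case 3 input table; types and time zone are assumed
df <- data.frame(
  date = as.POSIXct(c("2016-08-12 12:44:43", "2016-08-12 12:44:38",
                      "2016-08-12 12:44:47", "2016-08-12 12:44:52",
                      "2016-08-12 12:44:53", "2016-08-12 12:44:54"),
                    tz = "UTC"),
  value = c(10093.40, 10093.35, 10088.00, 10089.95, 10089.95, 10088.95),
  size = c(4, 2, 2, 1, 1, 1),
  stockName = "HWA IS Equity",
  stringsAsFactors = FALSE
)
key <- 10089.95

o <- order(df$value)                                 # row order by value
rows <- o[seq_len(findInterval(key, df$value[o]))]   # rows with value <= key
result <- df[rows, ]
print(result$value)
# [1] 10088.00 10088.95 10089.95 10089.95
```

The selected rows match the expected result in the question, although the row names keep their original positions (3, 6, 4, 5) unless they are reset, for example with rownames(result) <- NULL.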
