See @eddi's answer for a quicker solution (for this specific problem). It also works when x1 is not an integer.
The algorithm you are looking for is an interval tree, and there is a Bioconductor package called IRanges that performs this task. It's hard to beat.
    require(IRanges)
    require(data.table)
    my.df[, res := countOverlaps(IRanges(my.df$x1, width=1),
                                 IRanges(my.df$x1-tol+1, my.df$x1+tol-1))]
Some explanation:
If you break the code up, you can write it in three lines:
    ir1 <- IRanges(my.df$x1, width=1)
    ir2 <- IRanges(my.df$x1-tol+1, my.df$x1+tol-1)
    cnt <- countOverlaps(ir1, ir2)
Essentially, we create two sets of "ranges" (just type ir1 and ir2 at the console to see what they look like). Then we ask, for each entry in ir1, how many ranges in ir2 it overlaps (this is the interval tree part). And it is very efficient. countOverlaps has a type argument, which defaults to type = "any"; you can explore the other types if you want. It is extremely useful. The findOverlaps function is also relevant.
Note: there may be faster solutions (in fact, see @eddi's answer) for this particular case, where the width of ir1 is 1. But for tasks where the widths are variable and/or > 1, this should be the fastest.
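To make the two ranges concrete, here is a minimal sketch on toy data. The vector x1 and the value of tol below are made up for illustration and are not the data from the question. It checks the countOverlaps result against a brute-force count in base R; the IRanges part only runs if the package is installed.

```r
# Toy data (hypothetical, not from the question): integer positions and a tolerance.
x1  <- c(1L, 3L, 4L, 10L)
tol <- 2L

# Reference answer: for each value, count the values strictly closer than tol.
brute <- vapply(x1, function(v) sum(abs(x1 - v) < tol), integer(1))

if (requireNamespace("IRanges", quietly = TRUE)) {
  # Width-1 query ranges, one per value.
  ir1 <- IRanges::IRanges(start = x1, width = 1)
  # Closed windows [x1 - tol + 1, x1 + tol - 1]; for integers this means |diff| < tol.
  ir2 <- IRanges::IRanges(start = x1 - tol + 1L, end = x1 + tol - 1L)
  # For each range in ir1, the number of ranges in ir2 that it overlaps.
  cnt <- IRanges::countOverlaps(ir1, ir2)
  stopifnot(identical(as.integer(cnt), brute))
}
```

The +1 and -1 on the window ends are what turn the closed integer ranges into a strict "less than tol" comparison, which is also why this construction assumes x1 is an integer.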
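As a rough illustration of the variable-width case, the sketch below counts overlaps between query and subject intervals of different widths. The interval endpoints are invented for the example, and the brute-force check in base R is only there to show what is being counted; the IRanges part runs only if the package is installed.

```r
# Hypothetical closed integer intervals, given as start/end vectors.
q_start <- c(1L, 5L, 8L);  q_end <- c(3L, 5L, 12L)   # queries, widths 3, 1, 5
s_start <- c(2L, 3L, 10L); s_end <- c(4L, 6L, 11L)   # subjects

# Two closed intervals overlap when each one starts no later than the other ends.
brute_var <- vapply(seq_along(q_start), function(i) {
  sum(q_start[i] <= s_end & s_start <= q_end[i])
}, integer(1))

if (requireNamespace("IRanges", quietly = TRUE)) {
  query   <- IRanges::IRanges(start = q_start, end = q_end)
  subject <- IRanges::IRanges(start = s_start, end = s_end)
  # Default type = "any": any shared position counts as an overlap.
  cnt_var <- IRanges::countOverlaps(query, subject)
  stopifnot(identical(as.integer(cnt_var), brute_var))
}
```

The brute-force version compares every query with every subject, so it is only meant as a correctness check on small inputs.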
Benchmarking:
    ag <- function(my.df) my.df[, res := sum(abs(my.df$x1-x1) < tol), by=x1]
    ro <- function(my.df) {
        my.df[,res:= { y = my.df$x1
                       sum(y > (x1 - tol) & y < (x1 + tol))
                     }, by=x1]
    }
    ar <- function(my.df) {
        my.df[, res := countOverlaps(IRanges(my.df$x1, width=1),
                                     IRanges(my.df$x1-tol+1, my.df$x1+tol-1))]
    }

    require(microbenchmark)
    microbenchmark(r1 <- ag(copy(my.df)), r2 <- ro(copy(my.df)),
                   r3 <- ar(copy(my.df)), times=100)

    Unit: milliseconds
                      expr      min       lq   median       uq       max neval
     r1 <- ag(copy(my.df)) 33.15940 39.63531 41.61555 44.56616 208.99067   100
     r2 <- ro(copy(my.df)) 69.35311 76.66642 80.23917 84.67419 344.82031   100
     r3 <- ar(copy(my.df)) 11.22027 12.14113 13.21196 14.72830  48.61417   100 <~~~

    identical(r1, r2)