How can I speed up how I check equality between numbers in R

Currently, this is how I check for numerical equality, it works if x is a numeric and y vector.

 almostEqual <- function(x, y, tolerance=1e-8) { diff <- abs(x - y) mag <- pmax( abs(x), abs(y) ) ifelse( mag > tolerance, diff/mag <= tolerance, diff <= tolerance) } 

An example :

 almostEqual(1,c(1,1.00000000000001,1.00002)) [1] TRUE TRUE FALSE 

Can you do it faster (only with base R)?

EDIT : I recommend this, which I find useful

 "%~=%" <- almostEqual; "%~in%" <- function(x,y){ sapply(x,FUN=function(a,b){any(almostEqual(a,b))},y)}; 
+4
source share
1 answer

Cutting ifelse to start will allow you to save 57% ...

 almostEqual2 <- function(x, y, tolerance=1e-8) { diff <- abs(x - y) mag <- pmax( abs(x), abs(y) ) out <- logical(length(y)) out[ mag > tolerance ] <- (diff/mag <= tolerance)[ mag > tolerance] out[ ! mag > tolerance ] <- (diff <= tolerance)[! mag > tolerance] return( out ) } require(microbenchmark) set.seed(1) x <- 1 y <- rnorm(1e6) bm <- microbenchmark( almostEqual(x,y,tol=0.5) , almostEqual2(x,y,tol=0.5) , times = 25 ) print( bm , digits = 3 , unit = "relative" , order = "median" ) #Unit: relative # expr min lq median uq max neval # almostEqual2(x, y, tol = 0.5) 1.00 1.00 1.00 1.00 1.00 25 # almostEqual(x, y, tol = 0.5) 2.09 1.76 1.73 1.86 1.82 25 

Using rcpp

I don’t understand why you wouldn’t use the most suitable for the package in CRAN outside the base , but if you would like you to be able to implement 5x acceleration of my previous effort (10 times on OP), it also correctly handles NA ...

 #include <Rcpp.h> using namespace Rcpp; //[[Rcpp::export]] LogicalVector all_equalC( double x , NumericVector y , double tolerance ){ NumericVector diff = abs( x - y ); NumericVector mag = pmax( abs(x) , abs(y) ); LogicalVector res = ifelse( mag > tolerance , diff/mag <= tolerance , diff <= tolerance ); return( res ); } 

Available using Rcpp::sourceCpp('path/to/file.cpp') . Results...

 bm <- microbenchmark( almostEqual(x,y,tol=0.5) , almostEqual2(x,y,tol=0.5) , all_equalC(x,y,tolerance=0.5) , times = 25 ) print( bm , digits = 3 , unit = "relative" , order = "median" ) #Unit: relative # expr min lq median uq max neval # all_equalC(x, y, tolerance = 0.5) 1.00 1.00 1.00 1.00 1.00 25 # almostEqual2(x, y, tol = 0.5) 4.50 4.39 5.39 5.24 7.32 25 # almostEqual(x, y, tol = 0.5) 8.69 9.34 9.24 9.96 10.91 25 
+3
source

Source: https://habr.com/ru/post/1502617/


All Articles