Computation time of != versus !(==)

I was wondering how much faster a != 0 is than ! a == 0, and used the R package microbenchmark. Here is the code (reduce 3e6 and 100 if your computer is slow):

 library("microbenchmark") a <- sample(0:1, size=3e6, replace=TRUE) speed <- microbenchmark(a != 0, ! a == 0, times=100) boxplot(speed, notch=TRUE, unit="ms", log=F) 

Each time, I get a plot like the one below. As expected, the first version is faster (median 26 ms) than the second (33 ms).

But where do these few very high values (outliers) come from? Is this some kind of memory-management effect? If I set times to 10, there are no outliers ...
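For reference, the raw per-iteration timings behind the boxplot can be inspected directly; a minimal sketch reusing the speed object from the code above (the ms column is just a convenience):

    # One row per iteration; the time column is in nanoseconds.
    df <- as.data.frame(speed)
    df$ms <- df$time / 1e6          # convert to milliseconds
    # Median and maximum per expression; a maximum far above the
    # median corresponds to the outlier iterations.
    aggregate(ms ~ expr, data=df, FUN=function(x) c(median=median(x), max=max(x)))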

Edit: sessionInfo(): R version 3.1.2 (2014-10-31), Platform: x86_64-w64-mingw32/x64 (64-bit)

2 answers

You say that you get no outliers with times=10, but run microbenchmark with times=10 several times and you will most likely see the occasional strange outlier. Below is a comparison of one times=100 run with ten times=10 runs, which shows that outliers occur in both situations.

Given the size of the objects involved in the expressions, I suspect this is a memory issue: problems arise when your machine struggles with memory limitations due to processes other than R.

    a <- sample(0:1, size=3e6, replace=TRUE)
    speed1 <- microbenchmark(a != 0, ! a == 0, times=100)
    speed1 <- as.data.frame(speed1)
    speed2 <- replicate(10, microbenchmark(a != 0, ! a == 0, times=10), simplify=FALSE)
    speed2 <- do.call(rbind, lapply(speed2, cbind))
    times <- cbind(rbind(speed1, speed2), method=rep(1:2, each=200))
    boxplot(time ~ expr + method, data=times,
            names=c('!=; 1x100', '!==; 1x100', '!=; 10x10', '!==; 10x10'))

[Boxplot comparing the single 1x100 run with the ten 10x10 runs]
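If memory pressure is the suspect, one crude check is to free memory first and let microbenchmark run warm-up iterations before measuring; a rough sketch along those lines, reusing the same a (speed_warm is just an illustrative name, and this reduces rather than eliminates interference from other processes):

    library("microbenchmark")
    a <- sample(0:1, size=3e6, replace=TRUE)
    gc()                                    # free unused memory before timing
    # warm-up iterations spin the processor up before the measured runs
    speed_warm <- microbenchmark(a != 0, ! a == 0, times=100,
                                 control=list(warmup=10))
    boxplot(speed_warm, notch=TRUE, unit="ms")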


I think the comparison is unfair. Of course you will get outliers; computation time depends on several factors (garbage collection, result caching, etc.), so this is not a surprise. You use the same vector a in all the tests, so caching will certainly play a role.

I adjusted the setup a bit by regenerating the vector a before each computation, and got fairly comparable results:

 library("microbenchmark") do.not<-function() { a <- sample(0:1, size=3e6, replace=TRUE) a!=0; } do<-function() { a <- sample(0:1, size=3e6, replace=TRUE) a==0; } randomize <- function() { a <- sample(0:1, size=3e6, replace=TRUE) } speed <- microbenchmark(randomize(), do.not(), do(), times=100) boxplot(speed, notch=TRUE, unit="ms", log=F) 

[Boxplot of randomize(), do.not() and do() timings]

I also added the sampling step on its own (randomize()) as a reference, to see how much of the time it accounts for by itself.
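Since randomize() measures only the sampling, its median can be subtracted as a rough baseline for the other expressions; a small sketch over the speed object above (net_median is just an illustrative name, and the units are whatever summary() chooses, consistent across rows):

    # Medians per expression, as reported by microbenchmark's summary()
    med <- summary(speed)[, c("expr", "median")]
    # Subtract the sampling-only baseline to estimate the cost of the
    # comparison itself (a rough estimate, not an exact split).
    baseline <- med$median[med$expr == "randomize()"]
    data.frame(expr=med$expr, net_median=med$median - baseline)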

Personally, I am not surprised by the outliers. Even if you run the same tests with size=10, you still get outliers. They are not a consequence of the computation itself, but of the general state of the PC (other processes running, memory load, etc.).


Source: https://habr.com/ru/post/979276/

