Computation time of != versus !(==)

I was wondering how much faster a != 0 is than ! a == 0, and used the R package microbenchmark. Here is the code (reduce 3e6 and 100 if your computer is slow):

 library("microbenchmark") a <- sample(0:1, size=3e6, replace=TRUE) speed <- microbenchmark(a != 0, ! a == 0, times=100) boxplot(speed, notch=TRUE, unit="ms", log=F) 

Each time, I get a plot like the one below. As expected, the first version is faster (median 26 ms) than the second (33 ms).

But where do these few very high values (outliers) come from? Is this some kind of memory-management effect? If I set times to 10, there are no outliers ...
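For reference, the raw per-iteration timings behind the boxplot can be inspected directly; a minimal sketch reusing the speed object from the code above (the ms column is just a convenience):

    # One row per iteration; the time column is in nanoseconds.
    df <- as.data.frame(speed)
    df$ms <- df$time / 1e6          # convert to milliseconds
    # Median and maximum per expression; a maximum far above the
    # median corresponds to the outlier iterations.
    aggregate(ms ~ expr, data=df, FUN=function(x) c(median=median(x), max=max(x)))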

Edit: sessionInfo(): R version 3.1.2 (2014-10-31), Platform: x86_64-w64-mingw32/x64 (64-bit)

2 answers

You say that you get no outliers with times=10, but run microbenchmark with times=10 several times and you will most likely see the occasional strange outlier. Below is a comparison of one times=100 run with ten times=10 runs, which shows that outliers occur in both situations.

Given the size of the objects involved in the expressions, I suspect this is a memory issue: problems arise when your machine struggles with memory limitations due to processes other than R.

    a <- sample(0:1, size=3e6, replace=TRUE)
    speed1 <- microbenchmark(a != 0, ! a == 0, times=100)
    speed1 <- as.data.frame(speed1)
    speed2 <- replicate(10, microbenchmark(a != 0, ! a == 0, times=10), simplify=FALSE)
    speed2 <- do.call(rbind, lapply(speed2, cbind))
    times <- cbind(rbind(speed1, speed2), method=rep(1:2, each=200))
    boxplot(time ~ expr + method, data=times,
            names=c('!=; 1x100', '!==; 1x100', '!=; 10x10', '!==; 10x10'))

[Boxplot comparing the single 1x100 run with the ten 10x10 runs]
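If memory pressure is the suspect, one crude check is to free memory first and let microbenchmark run warm-up iterations before measuring; a rough sketch along those lines, reusing the same a (speed_warm is just an illustrative name, and this reduces rather than eliminates interference from other processes):

    library("microbenchmark")
    a <- sample(0:1, size=3e6, replace=TRUE)
    gc()                                    # free unused memory before timing
    # warm-up iterations spin the processor up before the measured runs
    speed_warm <- microbenchmark(a != 0, ! a == 0, times=100,
                                 control=list(warmup=10))
    boxplot(speed_warm, notch=TRUE, unit="ms")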


I think the comparison is unfair. Of course you will get outliers; computation time depends on several factors (garbage collection, result caching, etc.), so this is not a surprise. You use the same vector a in all the tests, so caching will certainly play a role.

I adjusted the setup a bit by regenerating the vector a before each computation, and got fairly comparable results:

 library("microbenchmark") do.not<-function() { a <- sample(0:1, size=3e6, replace=TRUE) a!=0; } do<-function() { a <- sample(0:1, size=3e6, replace=TRUE) a==0; } randomize <- function() { a <- sample(0:1, size=3e6, replace=TRUE) } speed <- microbenchmark(randomize(), do.not(), do(), times=100) boxplot(speed, notch=TRUE, unit="ms", log=F) 

[Boxplot of randomize(), do.not() and do() timings]

I also added the sampling step on its own (randomize()) as a reference, to see how much of the time it accounts for by itself.
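Since randomize() measures only the sampling, its median can be subtracted as a rough baseline for the other expressions; a small sketch over the speed object above (net_median is just an illustrative name, and the units are whatever summary() chooses, consistent across rows):

    # Medians per expression, as reported by microbenchmark's summary()
    med <- summary(speed)[, c("expr", "median")]
    # Subtract the sampling-only baseline to estimate the cost of the
    # comparison itself (a rough estimate, not an exact split).
    baseline <- med$median[med$expr == "randomize()"]
    data.frame(expr=med$expr, net_median=med$median - baseline)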

Personally, I am not surprised by the outliers. Even if you run the same tests with size=10, you still get outliers. They are not a consequence of the computation itself, but of the general state of the PC (other processes running, memory load, etc.).


Source: https://habr.com/ru/post/979276/

