If you have a multi-core machine, you can gain some speed by using all cores, for example with mclapply.
    > library(multicore)
    > M <- matrix(rnorm(40),nrow=20)
    > x1 <- apply(M, 2, t.test)
    > x2 <- mclapply(1:dim(M)[2], function(i) t.test(M[,i]))
    > all.equal(x1, x2)
    [1] "Component 1: Component 9: 1 string mismatch" "Component 2: Component 9: 1 string mismatch"
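This mini-example shows that both approaches give the same results: the only difference reported is a single string per test (most likely the data.name label, which records how the data was passed in), not any of the numbers. Now scale up: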
    > M <- matrix(rnorm(1e7), nrow=20)
    > system.time(invisible(apply(M, 2, t.test)))
       user  system elapsed
    101.346   0.626 101.859
    > system.time(invisible(mclapply(1:dim(M)[2], function(i) t.test(M[,i]))))
       user  system elapsed
     55.049   2.527  43.668
This was run on 8 virtual cores. Your mileage may vary. It is not a huge gain, but it takes very little effort.
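The multicore package used above has since been folded into the parallel package that ships with R, so on a current installation the same comparison can be sketched as follows. The seed, the choice of two workers, and the names n.cores, t1 and t2 are assumptions made for illustration only. Only the t statistics are compared, so the differing label strings do not get in the way.

```r
library(parallel)  # part of base R distributions; provides mclapply()

set.seed(1)        # arbitrary seed, only so the run is reproducible
M <- matrix(rnorm(40), nrow = 20)

# Forking is not available on Windows, where mc.cores must be 1.
n.cores <- if (.Platform$OS.type == "windows") 1L else 2L

x1 <- apply(M, 2, t.test)
x2 <- mclapply(seq_len(ncol(M)), function(i) t.test(M[, i]),
               mc.cores = n.cores)

# Pull out the numeric t statistics and compare those only.
t1 <- sapply(x1, function(r) unname(r$statistic))
t2 <- sapply(x2, function(r) unname(r$statistic))
all.equal(t1, t2)
```

If mc.cores is left out, mclapply falls back to the value of the mc.cores option, so it is worth setting it explicitly when you want to control how many workers are used.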
EDIT
If you only care about the t statistic itself, extracting the corresponding field ($statistic) makes things a little faster, particularly in the multi-core case:
    > system.time(invisible(apply(M, 2, function(c) t.test(c)$statistic)))
       user  system elapsed
     80.920   0.437  82.109
    > system.time(invisible(mclapply(1:dim(M)[2], function(i) t.test(M[,i])$statistic)))
       user  system elapsed
     21.246   1.367  24.107
Or, even faster, compute the t value directly:
    my.t.test <- function(c){
      n <- sqrt(length(c))
      mean(c)*n/sd(c)
    }
Then
    > system.time(invisible(apply(M, 2, function(c) my.t.test(c))))
       user  system elapsed
     21.371   0.247  21.532
    > system.time(invisible(mclapply(1:dim(M)[2], function(i) my.t.test(M[,i]))))
       user  system elapsed
    144.161   8.658   6.313
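As a sanity check, the direct formula can be compared with what t.test reports. The helper computes mean(x) * sqrt(n) / sd(x), which is the one-sample t statistic against a null mean of 0 (the default for t.test). The sketch below repeats the helper so that it runs on its own; the seed, the matrix size, and the names direct and via.htest are assumptions for illustration.

```r
# Helper from above: one-sample t statistic against mu = 0.
my.t.test <- function(c) {
  n <- sqrt(length(c))   # note: holds sqrt(sample size), not the size itself
  mean(c) * n / sd(c)
}

set.seed(42)                            # arbitrary seed
M <- matrix(rnorm(200), nrow = 20)      # 10 columns of 20 observations each

direct    <- apply(M, 2, my.t.test)
via.htest <- apply(M, 2, function(col) unname(t.test(col)$statistic))

# The two vectors should agree up to floating-point tolerance.
all.equal(direct, via.htest)
```

Keep in mind that the shortcut returns only the statistic. If you also need p-values or confidence intervals, you have to compute them yourself or go back to t.test.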