In general, for a task like this you can translate your function one-to-one into C++ using the Rcpp package. This should give a significant speedup.

First, the R version:
    random_sum <- function(loop = 1000) {
      x <- c(0, 0)
      z <- 0
      for (i in 2:loop) {
        x[1] <- x[2]
        x[2] <- x[1] + rnorm(1, 0, 1)
        if (x[2] < 0) {z <- z + 1}
      }
      z / loop
    }
    set.seed(123)
    random_sum()
Now the C++ version:
library("Rcpp") cppFunction(" double random_sum_cpp(unsigned long loop = 1000) { double x1 = 0; double x2 = 0; double z = 0; for (unsigned long i = 2; i < loop; i++) { x1 = x2; x2 = x1 + Rcpp::rnorm(1)[0]; if (x2 < 0) z = z+1; } return z/loop; }") set.seed(123) random_sum_cpp() # [1] 0.134
For completeness, let's also consider the proposed vectorized version:
    random_sum_vector <- function(loop = 1000) {
      Y <- rnorm(loop)
      sum(cumsum(Y) < 0) / loop
    }
    set.seed(123)
    random_sum_vector()
It gives the same result for the same random seed, so it seems to be a viable competitor.
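As a quick sanity check (my own snippet, not part of the original benchmark; the variable names res_loop and res_vector are made up), you can run both implementations under the same seed and compare the returned fractions:

    # Run both versions with the same seed and compare the results.
    set.seed(123); res_loop   <- random_sum(1000)
    set.seed(123); res_vector <- random_sum_vector(1000)
    c(loop = res_loop, vectorized = res_vector)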
In the benchmark, the C++ version and the vectorized version perform similarly, with the vectorized version having a slight edge over the C++ one:
    > microbenchmark(random_sum(100000), random_sum_vector(100000), random_sum_cpp(100000))
    Unit: milliseconds
                          expr        min         lq       mean     median         uq       max neval
            random_sum(1e+05) 184.205588 199.859266 209.220232 205.137043 211.026740 274.47615   100
     random_sum_vector(1e+05)   6.320690   6.631704   7.273645   6.799093   7.334733  18.48649   100
        random_sum_cpp(1e+05)   8.950091   9.362303  10.663295   9.956996  11.079513  21.30898   100
However, the vectorized version has to allocate vectors of length `loop`, so it will blow up your memory for very long loops, while the C++ version needs practically no memory.
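If you want to stay in pure R but keep memory bounded, one possible compromise is to process the walk in fixed-size chunks, carrying the running sum across chunks. This is my own sketch, not part of the original answer; the function name random_sum_chunked and the chunk argument are made up for illustration:

    random_sum_chunked <- function(loop = 1e6, chunk = 1e5) {
      z <- 0       # number of steps on which the walk is negative
      offset <- 0  # value of the walk at the end of the previous chunk
      done <- 0    # steps processed so far
      while (done < loop) {
        n <- min(chunk, loop - done)
        w <- offset + cumsum(rnorm(n))  # vectorized within the chunk
        z <- z + sum(w < 0)
        offset <- w[n]
        done <- done + n
      }
      z / loop
    }

Memory then scales with `chunk` rather than `loop`, at the cost of a short R-level loop over the chunks.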
For 10^9 steps, the C++ version takes about 2 minutes (110 seconds) on my machine. I have not tried the R version; based on the shorter benchmarks, it would probably take around 7 hours.
    > microbenchmark(random_sum_cpp(10^9), times = 1)
    Unit: seconds
                     expr      min       lq     mean   median       uq      max neval
     random_sum_cpp(10^9) 110.2182 110.2182 110.2182 110.2182 110.2182 110.2182     1