As the title of the question says, I would like to know why byte-compiled R code (using compiler::cmpfun) is faster than the equivalent Rcpp code for the following mathematical function:
    func1 <- function(alpha, tau, rho, phi) {
      abs((alpha + 1)^(tau) * phi - rho * (1 - (1 + alpha)^(tau)) / (1 - (1 + alpha)))
    }
Since this is a simple numerical operation, I would have expected Rcpp (funcCpp and funcCpp2) to be much faster than the byte-compiled R (func1c and func2c), especially since R has the extra overhead of either storing (1 + alpha)^tau or recomputing it. In fact, computing this power twice seems to be faster than the memory allocation in R (func1c vs func2c), which seems especially counterintuitive, since n is large. My other guess is that compiler::cmpfun is pulling off some magic, but I would like to know whether that is really the case.
So, two things I would like to know:
FWIW, here are my C++ compiler and R versions:
    user% g++ --version
    Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
    Apple LLVM version 7.0.0 (clang-700.0.72)
    Target: x86_64-apple-darwin14.3.0
    Thread model: posix

    user% R --version
    R version 3.2.2 (2015-08-14) -- "Fire Safety"
    Copyright (C) 2015 The R Foundation for Statistical Computing
    Platform: x86_64-apple-darwin14.5.0 (64-bit)
And here is the R and Rcpp code:
    library(Rcpp)
    library(rbenchmark)

    func1 <- function(alpha, tau, rho, phi) {
      abs((1 + alpha)^(tau) * phi - rho * (1 - (1 + alpha)^(tau)) / (1 - (1 + alpha)))
    }

    func2 <- function(alpha, tau, rho, phi) {
      pval <- (alpha + 1)^(tau)
      abs(pval * phi - rho * (1 - pval) / (1 - (1 + alpha)))
    }

    func1c <- compiler::cmpfun(func1)
    func2c <- compiler::cmpfun(func2)

    func3c <- Rcpp::cppFunction('
      double funcCpp(double alpha, int tau, double rho, double phi) {
        double pow_val = std::exp(tau * std::log(alpha + 1.0));
        double pAg = rho/alpha;
        return std::abs(pow_val * (phi - pAg) + pAg);
      }')

    func4c <- Rcpp::cppFunction('
      double funcCpp2(double alpha, int tau, double rho, double phi) {
        double pow_val = pow(alpha + 1.0, tau);
        double pAg = rho/alpha;
        return std::abs(pow_val * (phi - pAg) + pAg);
      }')

    res <- benchmark(
      func1(0.01, 200, 100, 1000000),
      func1c(0.01, 200, 100, 1000000),
      func2(0.01, 200, 100, 1000000),
      func2c(0.01, 200, 100, 1000000),
      func3c(0.01, 200, 100, 1000000),
      func4c(0.01, 200, 100, 1000000),
      funcCpp(0.01, 200, 100, 1000000),
      funcCpp2(0.01, 200, 100, 1000000),
      replications = 100000,
      order = 'relative',
      columns = c("test", "replications", "elapsed", "relative"))
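As a side check on the code above: the C++ versions do not transcribe the R expression literally. They use the algebraically simplified form pow_val * (phi - rho/alpha) + rho/alpha, which follows from 1 - (1 + alpha) = -alpha. The standalone sketch below compares the two forms numerically outside of Rcpp. The names direct_form, simplified_form and relative_gap are hypothetical helpers introduced only for this illustration.

```cpp
#include <cmath>

// Literal transcription of the R expression used in func1.
double direct_form(double alpha, int tau, double rho, double phi) {
    const double p = std::pow(1.0 + alpha, tau);
    return std::fabs(p * phi - rho * (1.0 - p) / (1.0 - (1.0 + alpha)));
}

// Simplified form used in funcCpp and funcCpp2:
// 1 - (1 + alpha) == -alpha, so the expression reduces to
// p * (phi - rho/alpha) + rho/alpha.
double simplified_form(double alpha, int tau, double rho, double phi) {
    const double p = std::pow(1.0 + alpha, tau);
    const double pAg = rho / alpha;
    return std::fabs(p * (phi - pAg) + pAg);
}

// Relative difference between the two forms (hypothetical helper).
double relative_gap(double alpha, int tau, double rho, double phi) {
    const double a = direct_form(alpha, tau, rho, phi);
    const double b = simplified_form(alpha, tau, rho, phi);
    return std::fabs(a - b) / std::fmax(std::fabs(a), std::fabs(b));
}
```

Calling relative_gap(0.01, 200, 100.0, 1000000.0) with the benchmark arguments should give a value close to floating-point rounding error, so the R and C++ variants are computing the same quantity and the timing comparison is like for like.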
And here is the output of rbenchmark:
    test                             replications  elapsed  relative
    func1c(0.01, 200, 100, 1e+06)          100000    0.349     1.000
    func2c(0.01, 200, 100, 1e+06)          100000    0.372     1.066
    funcCpp2(0.01, 200, 100, 1e+06)        100000    0.483     1.384
    func4c(0.01, 200, 100, 1e+06)          100000    0.509     1.458
    func2(0.01, 200, 100, 1e+06)           100000    0.510     1.461
    funcCpp(0.01, 200, 100, 1e+06)         100000    0.524     1.501
    func3c(0.01, 200, 100, 1e+06)          100000    0.546     1.564
    func1(0.01, 200, 100, 1e+06)           100000    0.549     1.573
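To help read the table: as far as I understand rbenchmark, the relative column is each elapsed time divided by the fastest elapsed time, and dividing elapsed by replications gives the average cost of a single call. The sketch below reproduces that arithmetic. The helper names relative_to_fastest and micros_per_call are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Ratio of each elapsed time to the fastest one, rounded to 3 decimals
// (assumed to mirror how the "relative" column is derived).
std::vector<double> relative_to_fastest(const std::vector<double>& elapsed) {
    const double fastest = *std::min_element(elapsed.begin(), elapsed.end());
    std::vector<double> out;
    out.reserve(elapsed.size());
    for (double t : elapsed) {
        out.push_back(std::round(t / fastest * 1000.0) / 1000.0);
    }
    return out;
}

// Average time per call in microseconds, given total elapsed seconds.
double micros_per_call(double elapsed_seconds, int replications) {
    return elapsed_seconds * 1e6 / replications;
}
```

With the numbers above, the fastest entry (0.349 s over 100000 replications) works out to roughly 3.5 microseconds per call, and the slowest (0.549 s) to roughly 5.5 microseconds per call.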