What is the fastest way in R to compute a rolling maximum with a variable window size?

I have two numeric vectors: one stores the values used to calculate the maximum, the other stores the length of the rolling window to use at each position. Sample code is below. I am trying to speed up the code inside system.time. Is there a ready-made function or a vectorized way to do the same thing?

    a <- rep(1:5, 20000)
    set.seed(123)
    b <- rep(sample(1:50), 2000)

    system.time({
      out <- vector(mode = 'numeric', length = NROW(a))
      for (i in seq_along(a)) {
        # the window is the last b[i] values of a, ending at position i
        if (i - b[i] >= 0) out[i] <- max(a[(i - b[i] + 1):i])
        else out[i] <- NA
      }
    })
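To make the window rule concrete: position i looks back over the last b[i] values of a, and positions whose window would start before the first element get NA. Here is a minimal sketch on a tiny input. The vectors and the helper name roll_max_ref are made up for illustration and are not part of the original question.

```r
# Tiny hypothetical input, chosen only to make the window rule visible.
a_small <- c(3, 1, 4, 1, 5, 9, 2, 6)
b_small <- c(1, 3, 2, 2, 5, 1, 3, 4)

# Reference implementation (hypothetical helper): plain loop, one max() per position.
roll_max_ref <- function(a, b) {
  out <- rep(NA_real_, length(a))
  for (i in seq_along(a)) {
    start <- i - b[i] + 1            # first index of the window ending at i
    if (start >= 1) out[i] <- max(a[start:i])
  }
  out
}

roll_max_ref(a_small, b_small)
# [1]  3 NA  4  4  5  9  9  9
```

Position 2 is NA because a window of length 3 ending there would need an element before a[1].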
2 answers

I managed to vectorize parts of it:

Original -

    system.time({
      out <- vector(mode = 'numeric', length = NROW(a))
      for (i in seq_along(a)) {
        if (i - b[i] >= 0) out[i] <- max(a[(i - b[i] + 1):i])
        else out[i] <- NA
      }
    })
    ##  user  system elapsed
    ##  0.64    0.00    0.64

A bit vectorized -

    system.time({
      nr <- NROW(a)
      out <- rep(NA, nr)
      m <- 1:nr - b + 1        # start index of each window
      n <- (1:nr)[m > 0]       # positions whose window fits inside a
      for (i in n) out[i] <- max(a[m[i]:i])
    })
    ##  user  system elapsed
    ##  0.39    0.00    0.39

You can vectorize parts of this problem, namely working out the starting index of each window in a (I called it str) and the end of each window (end). I still have to use a looping construct, mapply, to apply those index positions to a and take the max. For instance:

    x <- seq_len(length(a))
    end <- which(x - b >= 0)      # window end positions that fit inside a
    str <- end - b[end] + 1       # window start, matching (i - b[i] + 1):i in the question
    res <- a
    res[-end] <- NA
    res[end] <- mapply(function(x, y) max(a[x:y]), str, end)

And comparing with @e4e5f4's answer:

    identical(res, out)
    [1] TRUE
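One caveat if you repeat this check against the loop from the question instead: identical() also compares storage type. A result pre-allocated with vector(mode = 'numeric', ...) is stored as double, while one pre-allocated with rep(NA, ...) becomes integer when integer maxima are assigned into it. The sketch below uses a made-up cumulative maximum (not the rolling maximum itself) only to show that effect; the variable names are hypothetical.

```r
a_int <- rep(1:5, 4)                                   # integer input, like a in the question

out_dbl <- vector(mode = "numeric", length = length(a_int))  # double storage
out_int <- rep(NA, length(a_int))                      # logical NA, becomes integer on assignment
for (i in seq_along(a_int)) {
  out_dbl[i] <- max(a_int[1:i])
  out_int[i] <- max(a_int[1:i])
}

typeof(out_dbl)                       # "double"
typeof(out_int)                       # "integer"
identical(out_dbl, out_int)           # FALSE: same values, different storage type
isTRUE(all.equal(out_dbl, out_int))   # TRUE: numeric comparison ignores the type difference
```

So a FALSE from identical() does not necessarily mean the values differ; all.equal() is the safer check when the two results were allocated differently.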

However, this is not so fast:

       user  system elapsed
       0.46    0.00    0.47

If there were a way to vectorize that last operation, it would be very fast, but I can't think of one at the moment!
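One possible way to avoid calling max() once per element is a sparse table (a standard range-maximum technique, not something proposed in either answer). Precompute the maxima of all blocks of length 1, 2, 4, ... with pmax, then cover each window with two overlapping power-of-two blocks. The only remaining loops run over the block sizes, not over the elements. This is a sketch under assumptions: the function name roll_max_sparse is made up, b is assumed to hold whole numbers >= 1, and it needs roughly length(a) * log2(max(b)) extra memory. Whether it is actually faster on your data is something to check with system.time.

```r
# Sketch of a sparse-table rolling maximum; roll_max_sparse is a hypothetical name.
roll_max_sparse <- function(a, b) {
  n <- length(a)
  stopifnot(length(b) == n, all(b >= 1))
  ok <- which(seq_len(n) - b >= 0)     # positions whose window fits inside a
  out <- rep(a[NA_integer_], n)        # all-NA vector of the same type as a
  if (length(ok) == 0) return(out)

  # Level of each valid window: the largest k with 2^k <= b.
  lev <- findInterval(b[ok], 2^(0:30)) - 1L
  K <- max(lev)

  # tab[[k + 1]][j] holds max(a[j:(j + 2^k - 1)]), built by doubling.
  tab <- vector("list", K + 1L)
  tab[[1]] <- a
  for (k in seq_len(K)) {
    half <- 2^(k - 1)
    prev <- tab[[k]]
    j <- seq_len(length(prev) - half)
    tab[[k + 1L]] <- pmax(prev[j], prev[j + half])
  }

  # Each window is covered by two overlapping blocks of length 2^k:
  # one starting at the window start, one ending at the window end.
  for (k in unique(lev)) {
    sel <- ok[lev == k]
    start <- sel - b[sel] + 1
    t <- tab[[k + 1L]]
    out[sel] <- pmax(t[start], t[sel - 2^k + 1])
  }
  out
}

# Usage on the data from the question.
a <- rep(1:5, 20000)
set.seed(123)
b <- rep(sample(1:50), 2000)
system.time(res2 <- roll_max_sparse(a, b))
```

The two blocks may overlap, which is harmless for max because counting an element twice does not change the result. The same trick works for min, but not for sums or means.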


Source: https://habr.com/ru/post/943689/

