Improving performance in multiple Time-Range subsets of xts?

Is there a better way for the following code:

slice.periods <- function (x, periods, ...)
{
  if (!require("xts")) {
    stop("Need 'xts'")
  }
  # subset once per period, then stitch the pieces back together
  Reduce(rbind.xts, lapply(periods, function(p) x[p, ...]))
}

where x is the xts object and periods is a vector of time-of-day range strings of the kind that xts subsetting recognizes. Usage example:

j <- xts(rnorm(10e6),Sys.time()-(10e6:1))
v <- c("T10:00/T11:00", "T13:00/T15:00", "T20:30/T22:00")
system.time(slice.periods(j, v))

## result on my MacBook Air (1.8 GHz Intel Core i7; 4 GB 1333 MHz DDR3)
##  user  system elapsed 
## 14.956   0.876  15.837 

There are several problems:

  • Reduce(rbind.xts, ...) copies data repeatedly, which may be too slow if each subset is large.
  • Each time slice does not use the xts index directly, so every period pays for a full subset pass; on a large object that is expensive.

I saw some reports that if the index is in UTC there are amazing speedups through direct access; see the following post: data.table time subset vs xts subset by time.

However, I would rather not be forced to keep the index in UTC.

As for data.table: I know that rbindlist is much faster than do.call(rbind, ...) or Reduce(rbind, ...). But data.table is not a time-series container, and rbindlist does not take xts objects directly, so I would need as.data.table conversions from xts and back again, which makes the data.table route less attractive here.
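
For reference, the round trip I have in mind looks roughly like this (a sketch, not from the original post; it assumes data.table's as.data.table method for xts objects is available):

```r
library(data.table)

# convert each slice to data.table, bind fast with rbindlist...
dts <- lapply(v, function(p) as.data.table(j[p]))
dt  <- rbindlist(dts)

# ...but returning to xts needs another conversion, which eats into the gains
jv2 <- as.xts(dt)
```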

Any suggestions are appreciated.


Answer: instead of repeatedly calling rbind.xts, collect the integer positions for every period and subset once:

jv <- j[unlist(lapply(v, function(i) j[i, which.i=TRUE])),]
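
To stay time-ordered when the periods might come back out of order, a sorted variant can be used; comparing it against the original function is then straightforward (a sketch, assuming j, v, and slice.periods as defined above and non-overlapping periods):

```r
# gather integer positions via which.i = TRUE, sort them,
# and do a single subset of the big object
idx <- sort(unlist(lapply(v, function(p) j[p, which.i = TRUE])))
jv  <- j[idx, ]

# with disjoint periods this should agree with the Reduce/rbind.xts version
stopifnot(identical(jv, slice.periods(j, v)))

# and the single subset avoids all the intermediate copies
system.time(j[sort(unlist(lapply(v, function(p) j[p, which.i = TRUE]))), ])
```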

As for UTC: the gain comes from the fact that, with a UTC index, xts can skip the costly POSIXct-to-POSIXlt timezone conversions when resolving the time-of-day ranges.
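
If converting is acceptable, switching the index timezone is a one-liner (a sketch, not from the original post; it assumes reinterpreting the display timezone of j's index is acceptable for your data):

```r
# store/display the index in UTC so time-of-day subsetting
# avoids expensive timezone conversions
tzone(j) <- "UTC"
system.time(j[unlist(lapply(v, function(p) j[p, which.i = TRUE])), ])
```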


Source: https://habr.com/ru/post/1526589/

