data.table time subset vs xts time subset

Question

Hi, I want to subset some intraday time data. Usually I use xts, with something like:

subset.string <- 'T10:00/T13:00' 
xts.min.obj[subset.string]

to get all the rows that fall between 10:00 and 13:00 (inclusive) on EVERY DAY, returned in xts format. But this is a little slow for my purposes, for example:

j <- xts(rnorm(10e6),Sys.time()-(10e6:1))
system.time(j['T10:00/T16:00'])
   user  system elapsed 
  5.704   0.577  17.115

I know that data.table is fast at subsetting large datasets, so I wonder whether, combined with the fasttime package for fast POSIXct creation, it would be worth writing a function like

dt.time.subset <- function(xts.min.obj, subset.string){
  require(data.table)
  require(fasttime)
  x.dt <- data.table(ts=format(index(xts.min.obj),"%Y-%m-%d %H:%M:%S %Z"),
                     coredata(xts.min.obj))
  out <- x.dt[,some.subsetting.operation.using."%between%"]
  xts(out, fastPOSIXct(out[,ts]))
}

that converts xts.min.obj to a data.table, subsets it, and converts the result back to xts? Or is this something better done in C?

benchmarking r xts data.table

h.l.m 27 . '13 14:30

eddi · Accepted Answer · 2013-06-27T16:22:40+0000

If you are fine with specifying the time range in UTC, you can do:

library(xts)
library(data.table)  # provides %between%

j[(.index(j) %% 86400) %between% c(10*3600, 16*3600 + 60)]
# +60 because xts includes that minute; you'll need to offset the times
# appropriately to match with xts unless you live in UTC :)

j <- xts(rnorm(10e6),Sys.time()-(10e6:1))
system.time(j[(.index(j) %% 86400) %between% c(10*3600, 16*3600 + 60)])
#  user  system elapsed 
#  1.17    0.08    1.25 
# likely faster on your machine as mine takes minutes to run the OP bench
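
For the "offset the times appropriately" part, here is a minimal sketch of one way it might be done, using base R only. The names `in_daily_window` and `utc_offset_secs` are made up for illustration and are not part of xts or data.table. It assumes a single fixed offset from UTC, so it will be off by an hour across a daylight saving change. It also uses a half-open interval, so a timestamp at exactly 16:01:00 is excluded, whereas the inclusive `%between%` above would keep it.

```r
# Sketch only: `in_daily_window` and `utc_offset_secs` are hypothetical names,
# not part of xts or data.table. Assumes one fixed UTC offset (no DST handling).
in_daily_window <- function(epoch_secs, start_secs, end_secs, utc_offset_secs = 0) {
  # Shift to local wall-clock time, then reduce to seconds since local midnight.
  secs_of_day <- (epoch_secs + utc_offset_secs) %% 86400
  # Half-open interval: start is included, end is excluded.
  secs_of_day >= start_secs & secs_of_day < end_secs
}

# Window 10:00 through 16:00:59 local time, for a zone 4 hours behind UTC.
# Test points on 1970-01-01: 14:00:00, 20:00:59 and 20:01:00 UTC.
epoch <- c(14 * 3600, 20 * 3600 + 59, 20 * 3600 + 60)
in_daily_window(epoch, 10 * 3600, 16 * 3600 + 60, utc_offset_secs = -4 * 3600)
# gives TRUE, TRUE, FALSE
```

With an xts object this would be used as `j[in_daily_window(.index(j), 10*3600, 16*3600 + 60, utc_offset_secs = -4*3600)]`, since `.index(j)` returns the index as seconds since the epoch.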
