How to add time series (ts) objects to a data.table, row by row?

I am trying to store ts objects row by row. The monthly data needed to build each time series (24 monthly values for 1980 and 1981) are stored row-wise in DT, so I just want to add a column to DT that holds a ts() object for each row. Here is a reproducible example in which I tried three different options, but none of them works as I expected.

    library(data.table)
    DT <- data.table(ID=seq(1:10),
        JAN_1980=rnorm(1:10), FEB_1980=rnorm(1:10), MAR_1980=rnorm(1:10),
        APR_1980=rnorm(1:10), MAY_1980=rnorm(1:10), JUN_1980=rnorm(1:10),
        JUL_1980=rnorm(1:10), AUG_1980=rnorm(1:10), SEP_1980=rnorm(1:10),
        OCT_1980=rnorm(1:10), NOV_1980=rnorm(1:10), DEC_1980=rnorm(1:10),
        JAN_1981=rnorm(1:10), FEB_1981=rnorm(1:10), MAR_1981=rnorm(1:10),
        APR_1981=rnorm(1:10), MAY_1981=rnorm(1:10), JUN_1981=rnorm(1:10),
        JUL_1981=rnorm(1:10), AUG_1981=rnorm(1:10), SEP_1981=rnorm(1:10),
        OCT_1981=rnorm(1:10), NOV_1981=rnorm(1:10), DEC_1981=rnorm(1:10))

    # First attempt
    DT[,TS_COL:=ts(.SD[,2:25,with=FALSE], start=c(1980,1), frequency=12)]

    # Second
    DT[,TS_COL:=ts(unlist(.SD[,2:25,with=FALSE]), start=c(1980,1), frequency=12)]

    # Third
    DT[,TS_COL:=list(list(list(ts(unlist(.SD[,2:25,with=FALSE]), start=c(1980,1), frequency=12))))]

I would like to access the ts object for a specific row like this (which I have not managed so far):

 DT[1,TS_COL] 

... and get something like (2 years of monthly data):

                 Jan         Feb         Mar         Apr         May         Jun         Jul         Aug         Sep         Oct         Nov         Dec
    1980  2.13303849  0.74954206 -0.45112504  2.13558888  1.11883498 -0.39074470  1.77374480 -0.19513901  0.49920019 -1.12875185  0.45598049  1.97730211
    1981  0.62764761 -0.86330094 -0.51585664  0.59677770 -0.71073980 -0.26208961 -0.38833227  1.39841244 -1.50490225 -0.72018921  1.06684672  0.07126184

Any hint on how to achieve this?

1 answer

I don't recall ever using ts() myself. I tend to have irregular time series, which I store in long format: either a single datetime column, or separate date and time columns (so that I can roll to the prevailing observation within the day, but not back to the previous day). I then either create rows at regular intervals and roll-join the data to them, or find the start and end of a window using which and roll, and extract the subset for that window.
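As a rough illustration of the long-format approach just described, here is a minimal sketch of a rolling join onto a regular grid. The tables obs and grid, their column names and the timestamps are made up for the example; only the roll and on arguments of data.table are real.

```r
library(data.table)

# Hypothetical irregular observations in long format
obs <- data.table(
  time  = as.POSIXct(c("2020-01-01 09:03:00", "2020-01-01 09:17:00"), tz = "UTC"),
  value = c(1.5, 2.5)
)

# Hypothetical regular 5-minute grid: 09:05, 09:10, 09:15, 09:20
grid <- data.table(
  time = seq(as.POSIXct("2020-01-01 09:05:00", tz = "UTC"),
             by = "5 min", length.out = 4)
)

# Rolling join: each grid time picks up the prevailing (most recent) observation
res <- obs[grid, on = "time", roll = TRUE]
print(res)
```

Here the first three grid rows take the 09:03 observation and the last one takes the 09:17 observation.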

However, let's have a go with ts().

Please include the error message or warning in your question. See points 6 and 7 on the Support page. Your example is not quite reproducible: for example, I get the following warnings, but it is possible that you get a different one (you did not include it, so there is nothing to try to reproduce). Nor is the example minimal, since we do not need 20-odd columns that wrap in the console output.

    DT[,TS_COL:=ts(.SD[,2:25,with=FALSE], start=c(1980,1), frequency=12)]
    # Warning messages:
    # 1: In `[.data.table`(DT, , `:=`(TS_COL, ts(.SD[, 2:25, with = FALSE],  :
    #   24 column matrix RHS of := will be treated as one vector
    # 2: In `[.data.table`(DT, , `:=`(TS_COL, ts(.SD[, 2:25, with = FALSE],  :
    #   Supplied 240 items to be assigned to 10 items of column 'TS_COL' (230 unused)

First of all, let's take a look at the manual. ?ts contains the following signature:

    ts(data = NA, start = 1, end = numeric(), frequency = 1,
       deltat = 1, ts.eps = getOption("ts.eps"), class = , names = )

You are using the first argument, data, for which it says:

data: a vector or matrix of the observed time-series values. A data frame will be coerced to a numeric matrix via data.matrix. (See also 'Details'.)

Since a data.table inherits from data.frame, it is a data.frame too. Therefore the data.table is coerced to a matrix.

Further down, the manual says this about the matrix case:

In the matrix case, each column of the matrix data is assumed to contain a single (univariate) time series.
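To make the column-wise convention concrete, here is a small sketch with made-up data. The matrix wide is hypothetical and mimics the question's layout (one ID per row, 24 monthly values per row); transposing it with t() is an illustration of the convention, not something the manual or the question prescribes.

```r
set.seed(1)

# Hypothetical wide layout: 2 rows (IDs) with 24 monthly values each
wide <- matrix(rnorm(2 * 24), nrow = 2)

# Passed as-is, ts() sees 24 series (columns) of 2 observations each
as_is <- ts(wide, start = c(1980, 1), frequency = 12)
dim(as_is)

# Transposed, each ID becomes one column, i.e. one monthly series of length 24
flipped <- ts(t(wide), start = c(1980, 1), frequency = 12)
dim(flipped)

# The series for the first ID
first_id <- flipped[, 1]
print(first_id)
```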

Now let's break the problem down and inspect the RHS that it is trying to assign. Just remove the TS_COL:= and run it again, so that the RHS is returned and we can look at it.

    RHS = DT[,ts(.SD[,2:25,with=FALSE], start=c(1980,1), frequency=12)]
    class(RHS)
    # [1] "mts"    "ts"     "matrix"
    dim(RHS)
    # [1] 10 24
    dim(DT)
    # [1] 10 26
    length(RHS)
    # [1] 240
    storage.mode(RHS)
    # [1] "double"

So it is a matrix. And worse, it is of type double, not integer. (Recall that we do not like base R's Date for use in data.table either because, oddly, Date is of type double rather than integer.)

You cannot store a matrix as a column in a data.table. data.table treats the matrix as the vector that it is internally, which is what the warning messages (shown above in this answer) refer to. Here are the warnings again:

    24 column matrix RHS of := will be treated as one vector
    Supplied 240 items to be assigned to 10 items of column 'TS_COL' (230 unused)

These warnings are raised by data.table's own code and are pretty good, I think.
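The numbers in the warning (240 items for 10 rows) follow from how R stores a matrix. A minimal base R sketch, using a made-up 10 x 24 matrix m in place of the real RHS:

```r
# Hypothetical stand-in for the RHS: 10 rows, 24 columns, type double
m <- matrix(as.numeric(1:240), nrow = 10, ncol = 24)

# A matrix is a plain vector with a dim attribute, so its length is 10 * 24
length(m)
attributes(m)

# Dropping the dim attribute leaves the column-major vector of 240 values
v <- as.vector(m)

# Element 11 of the vector is row 1 of column 2
identical(v[11], m[1, 2])
```

Assigning such an object to a column of 10 rows can therefore only use the first 10 of the 240 values, which is what the second warning reports.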

So, if you need to carry on using class ts in a data.table column, you would need to coerce the matrix to a list of 24 columns (24 vectors, each of length 10) rather than a 24-column matrix (internally a vector of length 240).
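A minimal sketch of what that coercion could look like, on a smaller made-up table (3 rows instead of 10; the names n, month_cols, col_list and first_series are hypothetical). Part (b) goes one step further than the text above and is only an assumption about what the question is after: a list column holding one ts object per row, built with by = ID.

```r
library(data.table)

set.seed(42)
n <- 3L  # hypothetical number of rows, kept small for readability
month_cols <- paste(toupper(month.abb), rep(1980:1981, each = 12), sep = "_")
DT <- data.table(ID = seq_len(n))
DT[, (month_cols) := lapply(month_cols, function(col) rnorm(n))]

# (a) A list of 24 plain vectors, each of length nrow(DT), instead of a matrix
col_list <- as.list(DT[, month_cols, with = FALSE])

# (b) Assumption: one ts object per row, stored in a list column.
#     The inner list() wraps the ts object so that it occupies a single cell.
DT[, TS_COL := list(list(ts(unlist(.SD, use.names = FALSE),
                            start = c(1980, 1), frequency = 12))),
   by = ID, .SDcols = month_cols]

# A list column returns a list, so extract the element with [[1]]
first_series <- DT[1, TS_COL][[1]]
print(first_series)
```

Note that DT[1, TS_COL] on its own returns a one-element list rather than the ts object itself, which is why the extra [[1]] is needed.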

But at this point it seems that class ts is not the right tool for the job. What do you really need to do? It would be better to step back and describe the bigger picture.


Source: https://habr.com/ru/post/1240118/

