How can I automatically create n lags in a time series?

I have a data frame with a column t. I want to create n lagged columns with names like t-1, t-2, etc.

    year      t  t-1  t-2
    19620101  1   NA   NA
    19630102  2    1   NA
    19640103  3    2    1
    19650104  4    3    2
    19650104  5    4    3
    19650104  6    5    4

My idea is that I will do this in four steps:

  • A loop for the column names using paste
  • A loop for the data frames of the lagged columns using paste
  • A loop to create the lagged columns
  • cbind them.

But I am stuck on the code. A rough attempt:

    df_final<-lagged(df="odd",n=3)

    lagged<-function(df,n){
      df<-zoo(df)
      lags<-paste("A", 1:n, sep ="_")
      for (i in 1:5) {
        odd<-as.data.frame(lag(odd$OBS_Q,-1*i,na.pad = TRUE))
        #Cbind here
      }

I am stuck writing this function. Could you show me a way to do it, or an easier alternative?

Related: Basic lag in R vector/data frame


Appendix:

Real data:

    x <- structure(list(
      DATE = 19630101:19630104,
      PRECIP = c(0, 0, 0, 0),
      OBS_Q = c(1.61, 1.48, 1.4, 1.33),
      swb = c(1.75, 1.73, 1.7, 1.67),
      gr4j = c(1.9, 1.77, 1.67, 1.58),
      isba = c(0.83, 0.83, 0.83, 0.83),
      noah = c(1.31, 1.19, 1.24, 1.31),
      sac = c(1.99, 1.8, 1.66, 1.57),
      swap = c(1.1, 1.05, 1.08, 0.99),
      vic.mm.day. = c(2.1, 1.75, 1.55, 1.43)
    ), .Names = c("DATE", "PRECIP", "OBS_Q", "swb", "gr4j", "isba", "noah",
                  "sac", "swap", "vic.mm.day."),
    class = c("data.table", "data.frame"), row.names = c(NA, -4L))

The column to be lagged is OBS_Q.

+2
3 answers

If you are looking for efficiency, try the new shift function from the development version of data.table:

    library(data.table) # V >= 1.9.5
    n <- 2
    setDT(df)[, paste("t", 1:n) := shift(t, 1:n)][]
    #    t t 1 t 2
    # 1: 1  NA  NA
    # 2: 2   1  NA
    # 3: 3   2   1
    # 4: 4   3   2
    # 5: 5   4   3
    # 6: 6   5   4

Here you can give the new columns any names you like (inside paste), and you do not need to bind the result back to the original, because the := operator updates the data set by reference.
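As an illustration only, the same idea could be applied to the OBS_Q column from the question. This sketch assumes a data.table version that provides shift(); the column names OBS_Q_lag1, OBS_Q_lag2 are made up for the example, and only two columns of the sample data are reproduced:

```r
library(data.table)  # assumes a version that provides shift() (>= 1.9.5)

# Two columns of the sample data from the question
x <- data.table(
  DATE  = 19630101:19630104,
  OBS_Q = c(1.61, 1.48, 1.40, 1.33)
)

n <- 2
lag_names <- paste0("OBS_Q_lag", 1:n)  # hypothetical names, pick your own

# shift() returns one lagged vector per element of 1:n;
# := adds them to x by reference, so no cbind is needed
x[, (lag_names) := shift(OBS_Q, 1:n)]
```

Wrapping lag_names in parentheses on the left of := tells data.table to use the values of that character vector as the new column names.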

+7

I would build something around base R's embed():

    x <- c(rep(NA,2),1:6)
    embed(x,3)
    #      [,1] [,2] [,3]
    # [1,]    1   NA   NA
    # [2,]    2    1   NA
    # [3,]    3    2    1
    # [4,]    4    3    2
    # [5,]    5    4    3
    # [6,]    6    5    4

Maybe something like this:

    f <- function(x, dimension, pad) {
      if(!missing(pad)) {
        x <- c(rep(pad, dimension-1), x)
      }
      embed(x, dimension)
    }
    f(1:6, dimension=3, pad=NA)
    #      [,1] [,2] [,3]
    # [1,]    1   NA   NA
    # [2,]    2    1   NA
    # [3,]    3    2    1
    # [4,]    4    3    2
    # [5,]    5    4    3
    # [6,]    6    5    4
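
To get from the embed() matrix to the named columns asked for in the question, one possible sketch is shown here. The helper name add_lags_embed and its arguments are hypothetical, and it assumes n is at least 1:

```r
# Hypothetical helper: append lags 1..n of column `col` to `df` using embed()
add_lags_embed <- function(df, col, n) {
  x <- df[[col]]
  # Pad with n leading NAs so embed() returns one row per original row
  m <- embed(c(rep(NA, n), x), n + 1)
  # Column 1 of m is the series itself; columns 2..(n + 1) are lags 1..n
  lags <- as.data.frame(m[, -1, drop = FALSE])
  names(lags) <- paste0(col, "-", seq_len(n))
  cbind(df, lags)
}

DF <- data.frame(year = c(19620101, 19630102, 19640103,
                          19650104, 19650104, 19650104),
                 t = 1:6)
res <- add_lags_embed(DF, "t", 2)
```

Here res should have the columns year, t, t-1 and t-2, matching the layout in the question.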
+9

1) lag.zoo. The lag.zoo function in the zoo package accepts a vector of lags. Here we want lag 0, lag -1 and lag -2:

    library(zoo)
    cbind(DF[-2], coredata(lag(zoo(DF$t), 0:-2)))

giving:

          year lag0 lag-1 lag-2
    1 19620101    1    NA    NA
    2 19630102    2     1    NA
    3 19640103    3     2     1
    4 19650104    4     3     2
    5 19650104    5     4     3
    6 19650104    6     5     4

This matches what you have in the question, but are you sure it is what you want? The last three rows have the same date, so the 4th row, for example, is lagged onto a row with the same date.

2) head. By defining a simple Lag function, we can do this using only base R:

    Lag <- function(x, n = 1) c(rep(NA, n), head(x, -n)) # n > 0
    data.frame(DF, `t-1` = Lag(DF$t), `t-2` = Lag(DF$t, 2), check.names = FALSE)

giving:

          year t t-1 t-2
    1 19620101 1  NA  NA
    2 19630102 2   1  NA
    3 19640103 3   2   1
    4 19650104 4   3   2
    5 19650104 5   4   3
    6 19650104 6   5   4

Note: We used this as our data frame:

 DF <- data.frame(year = c(19620101, 19630102, 19640103, 19650104, 19650104, 19650104), t = 1:6) 
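
The Lag approach in (2) writes out each lag by hand. For an arbitrary n, a small loop is one way to generalize it. The helper add_lags below is a hypothetical name, not part of any package, and it assumes n is between 1 and the number of rows:

```r
Lag <- function(x, n = 1) c(rep(NA, n), head(x, -n))  # n > 0, as in (2)

# Hypothetical helper: append lags 1..n of column `col` to `df`
add_lags <- function(df, col, n) {
  for (k in seq_len(n)) {
    # [[<- keeps non-syntactic names such as "t-1" unchanged
    df[[paste0(col, "-", k)]] <- Lag(df[[col]], k)
  }
  df
}

DF <- data.frame(year = c(19620101, 19630102, 19640103,
                          19650104, 19650104, 19650104),
                 t = 1:6)
res <- add_lags(DF, "t", 2)
```

For the data in the question's appendix, the call would presumably be add_lags(x, "OBS_Q", 3), provided x is first converted to a plain data frame with as.data.frame(x).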
+4

Source: https://habr.com/ru/post/1269449/

