Formatting time series data for short-term forecasting using recurrent neural networks

I want to predict day-ahead energy consumption using recurrent neural networks (RNNs), but I find the required data format (samples, timesteps, features) confusing. Let me explain with an example:

I have power_dataset.csv on Dropbox, which contains energy consumption from June 5 to June 18, sampled every 10 minutes (144 observations per day). To test the performance of an RNN with the rnn R package, I follow these steps:

  • Train model M to predict usage on June 17 using data from June 5-16.
  • Predict usage on June 18 with M and the updated data from June 6-17.

My understanding of the RNN data format:

samples: The number of samples or observations.

timesteps: The number of steps after which the pattern repeats. In my case, 144 observations occur every day, so each consecutive block of 144 observations makes up the timesteps. In other words, it defines the seasonality.

features: The number of features, which is one in my case, i.e. the time series of consumption over the historical days.

Accordingly, my script looks like this:

library(rnn)
df <- read.csv("power_dataset.csv")
train <- df[1:2016,] # train set from 5-16 June
test <- df[145:dim(df)[1],] # test set from 6-18 June
# prepare data to train a model
trainX <- train[1:1872,]$power # using only power column now
trainY <- train[1873:dim(train)[1],]$power
# data formatting acc. to rnn as [samples, timesteps, features]
tx <-  array(trainX,dim=c(NROW(trainX),144,1))
ty <-  array(trainY,dim=c(NROW(trainY),144,1))
model <- trainr(X=tx,Y=ty,learningrate = 0.04, hidden_dim = 10, numepochs = 100)

Error output:

The sample dimension of X is different from the sample dimension of Y.

The error is caused by incorrect data formatting. How do I format the data correctly?


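A few points:

  • The number of samples in X and Y must be the same; in your code there are 1872 for X and 144 for Y. Also, your tx array just repeats the same column 144 times, which does not make much sense.

  • An RNN or LSTM can be set up in at least two ways: Model1 learns the pattern across the 10-minute intervals within a day, while Model2 learns the pattern across the (previous) days.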
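Regarding the first point, the mismatch and the repeated columns can be reproduced with base R alone. The following is a minimal sketch, assuming synthetic stand-in vectors (plain integer sequences, not the real power_dataset.csv) with the same lengths as trainX and trainY in the question:

```r
# Synthetic stand-ins with the same lengths as in the question
# (assumption: the real values come from power_dataset.csv)
trainX <- seq_len(1872)
trainY <- seq_len(144)

# Same reshaping as in the question
tx <- array(trainX, dim = c(NROW(trainX), 144, 1))
ty <- array(trainY, dim = c(NROW(trainY), 144, 1))

# The first dimension is the sample dimension, and it differs
print(dim(tx))
print(dim(ty))

# array() recycles its input to fill the requested shape,
# so every one of the 144 columns of tx is a copy of trainX
cols_identical <- all(apply(tx[, , 1], 2, function(col) all(col == trainX)))
print(cols_identical)
```

The first dimensions (1872 versus 144) are what trainr compares, and the recycling explains why tx carries no more information than trainX itself.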


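# Model1: samples = days, timesteps = the 144 ten-minute slots of a day
# (assumes library(rnn) is loaded and df is read as in the question)
window <- 144
train <- df[1:(13*window),]$power
# one row per day: 13 x 144
tx <- t(sapply(1:13, function(x) train[((x-1)*window+1):(x*window)]))
ty <- tx[2:13,]       # targets: days 2-13
tx <- tx[-nrow(tx),]  # inputs: days 1-12
tx <-  array(tx,dim=c(NROW(tx),NCOL(tx),1))
ty <-  array(ty,dim=c(NROW(ty),NCOL(ty),1))
model <- trainr(X=tx,Y=ty,learningrate = 0.01, hidden_dim = 10, numepochs = 100)
# test input: days 2-13, transposed so that each row is one day, like tx
test <- t(sapply(2:13, function(x) train[((x-1)*window+1):(x*window)]))
pred  <- predictr(model,X=array(test,dim=c(NROW(test),NCOL(test),1)))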
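The Model1 layout can be checked without training anything. This is a minimal sketch in base R, assuming a made-up series in which each value encodes its day (integer part) and its 10-minute slot (fractional part); the names power and days are hypothetical and not part of the original script:

```r
window <- 144
n_days <- 13

# Hypothetical series: integer part = day index, fractional part = slot / 1000
power <- rep(1:n_days, each = window) + rep((1:window) / 1000, times = n_days)

# One row per day, one column per 10-minute slot
days <- matrix(power, nrow = n_days, ncol = window, byrow = TRUE)

# [samples, timesteps, features] = [12 days, 144 slots, 1]
tx <- array(days[1:12, ], dim = c(12, window, 1))  # inputs: days 1-12
ty <- array(days[2:13, ], dim = c(12, window, 1))  # targets: days 2-13

print(dim(tx))
print(dim(ty))

# Sample i of X is day i, and its target is day i + 1
print(floor(tx[, 1, 1]))
print(floor(ty[, 1, 1]))
```

Each sample is a whole day, so the model is asked to map the 144 readings of one day onto the 144 readings of the next.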

# Model2: samples = the 144 ten-minute slots, timesteps = the 12 previous days
window <- 144
train <- df[1:(13*window),]$power
# one column per day: 144 x 12
tx <- sapply(1:12, function(x) train[((x-1)*window+1):(x*window)])
ty <- train[(12*window+1):(13*window)]  # target: day 13
tx <-  array(tx,dim=c(NROW(tx),NCOL(tx),1))
# Y has a single timestep (many-to-one), hence seq_to_seq_unsync=TRUE below
ty <-  array(ty,dim=c(NROW(ty),1,1))
model <- trainr(X=tx,Y=ty,learningrate = 0.01, hidden_dim = 10, numepochs = 100, seq_to_seq_unsync=TRUE)
# test input: days 2-13, one column per day
test <- sapply(2:13, function(x) train[((x-1)*window+1):(x*window)])
pred  <- predictr(model,X=array(test,dim=c(NROW(test),NCOL(test),1)))
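The Model2 layout can be checked the same way. This is a minimal sketch in base R, again assuming a made-up series whose integer part is the day and whose fractional part is the slot; power, slots and test_x are hypothetical names used only for illustration:

```r
window <- 144
n_days <- 13

# Hypothetical series: integer part = day index, fractional part = slot / 1000
power <- rep(1:n_days, each = window) + rep((1:window) / 1000, times = n_days)

# One row per 10-minute slot, one column per day
slots <- matrix(power, nrow = window, ncol = n_days)

# [samples, timesteps, features] = [144 slots, 12 days, 1]
tx <- array(slots[, 1:12], dim = c(window, 12, 1))      # inputs: days 1-12
ty <- array(slots[, 13], dim = c(window, 1, 1))         # target: day 13 only
test_x <- array(slots[, 2:13], dim = c(window, 12, 1))  # days 2-13

print(dim(tx))
print(dim(ty))

# Sample 5 is the fifth slot of the day, followed across days 1-12
print(floor(tx[5, , 1]))
print(floor(ty[5, 1, 1]))
```

Here each sample is one time of day observed over 12 consecutive days, and the target is the same time of day on the following day.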
  1. RNN LSTM . . , .

"seq-to-seq-unsync = TRUE" , .


Source: https://habr.com/ru/post/1667902/

