Modeling time series with irregular data

Question

I am working on a pet project to forecast base oil prices from historical base oil prices. The data is weekly, but there are gaps where prices are not available.

I'm fairly comfortable modeling time series with complete data, but the models I have learned may not apply to irregular series. Should I use the xts class and then fit the usual ARIMA models in R?

Once I have built a model for forecasting future prices, I would like to take into account fluctuations in oil prices, profit margins on diesel fuel, car sales, economic growth, etc. (a multivariate model?) to increase accuracy. Can someone shed some light on how to do this effectively? To me it looks like a maze.

EDIT: Cropped data here: https://docs.google.com/document/d/18pt4ulTpaVWQhVKn9XJHhQjvKwNI9uQystLL4WYinrY/edit

Code:

Mod.fit <- arima(Y, order = c(3, 2, 6), method = "ML")

Result: Warning message: In log(s2) : NaNs produced

Will this warning affect my model accuracy?

With missing data, I cannot use the ACF and PACF. Is there a better way to select a model? I used the AIC (Akaike Information Criterion) to compare different ARIMA models with the code below. ARIMA (3,2,6) gave the smallest AIC.

Code:

AIC <- matrix(0, 6, 6)
for (p in 0:5) {
  for (q in 0:5) {
    mod.fit <- arima(Y, order = c(p, 2, q))
    AIC[p + 1, q + 1] <- mod.fit$aic  # row p + 1, column q + 1
  }
}
AIC

Result:

         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 1396.913 1328.481 1327.896 1328.350 1326.057 1325.063
[2,] 1343.925 1326.862 1328.321 1328.644 1325.239 1318.282
[3,] 1334.642 1328.013 1330.005 1327.304 1326.882 1314.239
[4,] 1336.393 1329.954 1324.114 1322.136 1323.567 1316.150
[5,] 1319.137 1321.030 1320.575 1321.287 1323.750 1316.815
[6,] 1321.135 1322.634 1320.115 1323.670 1325.649 1318.015
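A note on reading this table: the loop stores the fit for orders (p, 2, q) in row p + 1 and column q + 1, so the row and column numbers are each one larger than the orders they represent. The sketch below re-enters the AIC values from the result above (the variable names `aic`, `best`, `p_best` and `q_best` are illustrative) and maps the smallest entry back to its orders. On these numbers the minimum sits at [3, 6], which would correspond to ARIMA(2,2,5) rather than ARIMA(3,2,6).

```r
# AIC values copied from the result above (rows: p = 0..5, columns: q = 0..5).
aic <- matrix(c(
  1396.913, 1328.481, 1327.896, 1328.350, 1326.057, 1325.063,
  1343.925, 1326.862, 1328.321, 1328.644, 1325.239, 1318.282,
  1334.642, 1328.013, 1330.005, 1327.304, 1326.882, 1314.239,
  1336.393, 1329.954, 1324.114, 1322.136, 1323.567, 1316.150,
  1319.137, 1321.030, 1320.575, 1321.287, 1323.750, 1316.815,
  1321.135, 1322.634, 1320.115, 1323.670, 1325.649, 1318.015
), nrow = 6, byrow = TRUE)

# Locate the smallest AIC and convert matrix indices back to model orders.
best   <- which(aic == min(aic), arr.ind = TRUE)
p_best <- best[1, "row"] - 1
q_best <- best[1, "col"] - 1

cat("lowest AIC:", min(aic), "at p =", p_best, "and q =", q_best, "\n")
```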

r time-series modeling forecasting

leejy Nov 18 '11 at 9:50

1 answer

Gavin Simpson · Answer 1 · 2011-11-18T10:32:58+0000

No, in general you do not need to use xts and then fit an ARIMA model; that route needs an extra step. Missing values recorded as NA are handled by arima(), and if you use method = "ML" they are handled exactly; other methods may not get the innovations for the missing data. This works because arima() fits the ARIMA model in a state-space representation.

If the data is regularly spaced but has missing values, then the above should be fine.
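As a minimal sketch of the point above, the following fits an AR(1) model to a simulated regular series with some values set to NA. The data, the gap positions and the model order are all assumptions made for illustration; they are not the base oil prices from the question.

```r
# Simulated data: every value here is an illustrative assumption.
set.seed(1)
y <- arima.sim(model = list(ar = 0.6), n = 200)  # regular series, no gaps yet

# Blank out some observations to mimic weeks with no quoted price.
y_gaps <- y
y_gaps[c(20:24, 77, 130:133)] <- NA

# method = "ML" uses the exact likelihood from the state-space form,
# so the NA entries are handled rather than dropped.
fit <- arima(y_gaps, order = c(1, 0, 0), method = "ML")
print(fit)

# Forecasting from the fitted object works as usual.
fc <- predict(fit, n.ahead = 4)
print(fc$pred)
```

The NA entries stay in place in the series, so the time spacing between the remaining observations is preserved.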

The reason I say not to use xts in general is simply that arima() requires a univariate time series object (see ?ts) as its input. However, xts extends and inherits from zoo objects, and the zoo package does provide an as.ts() method for objects of class "zoo". So if you get your data into a zoo() or xts() object, you can coerce it to class "ts", and that should include NA in the appropriate places, which arima() will then handle if it can (i.e. there aren't too many missing values).
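The answer's route goes through zoo or xts and as.ts(). As an alternative sketch that stays in base R (a swapped-in approach, not the zoo method itself), the same NA-padded "ts" object can be built by placing the observed prices on a complete weekly grid. The dates, prices and names (`obs`, `grid`, `price_full`) are made up for illustration.

```r
# Hypothetical irregular weekly data: dates and prices are invented.
obs <- data.frame(
  date  = as.Date(c("2011-01-03", "2011-01-10", "2011-01-24",
                    "2011-02-07", "2011-02-14")),
  price = c(101.5, 102.0, 103.2, 102.8, 104.1)
)

# Complete weekly grid spanning the observed range.
grid <- seq(min(obs$date), max(obs$date), by = "week")

# Place each observation on the grid; weeks without a quote stay NA.
price_full <- rep(NA_real_, length(grid))
price_full[match(obs$date, grid)] <- obs$price

# A plain "ts" object, which is what arima() expects as input.
y <- ts(price_full)
print(y)
```

This assumes every observed date falls exactly on the weekly grid; dates that are off by a day or two would need to be aligned to the grid first.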
