I am currently working on a pet project to forecast future base oil prices from historical base oil prices. The data is weekly, but there are gaps where prices are not available.
I am reasonably comfortable modeling time series with complete data, but the models I learned may not apply to irregular series. Should I use the xts class, and can I still use the usual ARIMA models in R?
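To make the gap problem concrete, here is a minimal sketch of one possible approach in base R, without xts: put the observations on a complete weekly calendar so that missing weeks become explicit NA values, then fit with arima(), whose state-space likelihood accepts NAs. The dates and prices are simulated, and the names obs, grid and full are placeholders rather than anything from my real data.

```r
# Simulated stand-in for the real data: 120 weeks, 20 interior weeks missing.
set.seed(1)
all_weeks <- seq(as.Date("2020-01-06"), by = "week", length.out = 120)
prices    <- 100 + cumsum(rnorm(120))
keep      <- sort(c(1, 120, sample(2:119, 98)))  # always keep first and last week
obs       <- data.frame(date = all_weeks[keep], price = prices[keep])

# Complete weekly calendar spanning the observed range.
grid <- data.frame(date = seq(min(obs$date), max(obs$date), by = "week"))

# Left join: weeks without an observation get price = NA.
full <- merge(grid, obs, by = "date", all.x = TRUE)

# Regular weekly series with explicit gaps.
y <- ts(full$price)

# arima() uses a Kalman-filter likelihood, so NA entries are allowed.
fit <- arima(y, order = c(0, 1, 1), method = "ML")

# Point forecasts for the next four weeks.
fc <- predict(fit, n.ahead = 4)
print(fc$pred)
```

The order c(0, 1, 1) is only an example. The point is that the gaps are kept as NA rather than deleted, so the weekly spacing stays intact.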
Once I have built a model for forecasting future prices, I would also like to take into account fluctuations in oil prices, profit margins from diesel fuel, car sales, economic growth, etc. (a multivariable model?) to increase accuracy. Can someone shed some light on how I can do this effectively? To me, it looks like a maze.
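One candidate for the multivariable part is regression with ARIMA errors, which arima() supports through its xreg argument. The sketch below only illustrates the mechanics: all data are simulated, and crude and car_sales are hypothetical regressors with made-up coefficients.

```r
# Simulated data; every number here is invented for illustration.
set.seed(2)
n         <- 150
crude     <- 60 + cumsum(rnorm(n))                            # hypothetical oil price
car_sales <- 50 + as.numeric(arima.sim(list(ar = 0.5), n = n)) # hypothetical sales index
noise     <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.5))
base_oil  <- 20 + 0.8 * crude + 0.3 * car_sales + noise

# Regressors go in a matrix with one named column per variable.
X <- cbind(crude = crude, car_sales = car_sales)

# Hold out the last 10 weeks to mimic forecasting.
train <- 1:140
test  <- 141:150

# Linear regression on X with AR(1) errors.
fit <- arima(base_oil[train], order = c(1, 0, 0),
             xreg = X[train, ], method = "ML")
print(coef(fit))

# Forecasting needs regressor values for the forecast horizon.
fc <- predict(fit, n.ahead = length(test), newxreg = X[test, ])
print(fc$pred)
```

One catch with this design is that predict() needs future values of every regressor in newxreg, so in practice those have to be known in advance or forecast separately.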
EDIT: Cropped data here: https://docs.google.com/document/d/18pt4ulTpaVWQhVKn9XJHhQjvKwNI9uQystLL4WYinrY/edit
Code:
Mod.fit <- arima(Y, order = c(3, 2, 6), method = "ML")
Result:
Warning message: In log(s2) : NaNs produced
Will this warning affect my model accuracy?
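One way to probe this is to record the warnings and then inspect the fitted object. The sketch below assumes that a zero optimizer code, a positive innovation variance and finite standard errors are reasonable sanity checks. It does not prove the fit is good. The series y is simulated as a stand-in for Y, and the order c(1, 2, 1) is arbitrary.

```r
# Simulated twice-integrated series standing in for Y.
set.seed(3)
y <- cumsum(cumsum(arima.sim(list(ar = 0.5, ma = 0.3), n = 200)))

# Collect warnings raised during fitting instead of just printing them.
warn_log <- character(0)
fit <- withCallingHandlers(
  arima(y, order = c(1, 2, 1), method = "ML"),
  warning = function(w) {
    warn_log <<- c(warn_log, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Sanity checks on the final fit (assumed heuristics, not a formal test).
se <- suppressWarnings(sqrt(diag(fit$var.coef)))
checks <- c(
  converged     = fit$code == 0,                          # optim() convergence code
  sigma2_ok     = is.finite(fit$sigma2) && fit$sigma2 > 0,
  std_errors_ok = all(is.finite(se))
)

print(warn_log)
print(checks)
```

If all three checks pass, the warning may only reflect trial parameter values visited during optimization. A failed check would suggest trying a simpler model.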
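Because of the missing data, I cannot use the ACF and PACF. Is there a better way to select models? I used AIC (Akaike Information Criterion) to compare different ARIMA models using the code below. The smallest AIC (1314.239) is in row 3, column 6 of the result. Because the rows and columns are indexed from p = 0 and q = 0, that cell corresponds to ARIMA(2,2,5), not the ARIMA(3,2,6) I fitted above.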
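Code:
# Fit ARIMA(p,2,q) for p, q in 0..5 and store each model's AIC
AIC <- matrix(0, 6, 6)
for (p in 0:5) {
  for (q in 0:5) {
    mod.fit <- arima(Y, order = c(p, 2, q))
    AIC[p + 1, q + 1] <- mod.fit$aic  # row p+1, column q+1
  }
}
AIC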
Result:
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 1396.913 1328.481 1327.896 1328.350 1326.057 1325.063
[2,] 1343.925 1326.862 1328.321 1328.644 1325.239 1318.282
[3,] 1334.642 1328.013 1330.005 1327.304 1326.882 1314.239
[4,] 1336.393 1329.954 1324.114 1322.136 1323.567 1316.150
[5,] 1319.137 1321.030 1320.575 1321.287 1323.750 1316.815
[6,] 1321.135 1322.634 1320.115 1323.670 1325.649 1318.015
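Since row i and column j of this table hold p = i - 1 and q = j - 1, it is easy to misread the indices as model orders. The sketch below is a possible variant of the grid search that labels the table with the actual orders, skips fits that raise an error, and reads off the minimum programmatically. It runs on a simulated series y rather than my Y, uses a smaller grid to keep it quick, and the names aic_table, best_p and best_q are my own.

```r
# Simulated twice-integrated series standing in for Y.
set.seed(4)
y <- cumsum(cumsum(arima.sim(list(ar = 0.5), n = 200)))

p_max <- 2
q_max <- 2

# Label rows and columns with the orders so indices cannot be confused.
aic_table <- matrix(NA_real_, p_max + 1, q_max + 1,
                    dimnames = list(paste0("p=", 0:p_max),
                                    paste0("q=", 0:q_max)))

for (p in 0:p_max) {
  for (q in 0:q_max) {
    # A failed fit leaves NA in the table instead of stopping the loop.
    # Warnings are silenced here for brevity; inspect them separately.
    fit <- tryCatch(
      suppressWarnings(arima(y, order = c(p, 2, q), method = "ML")),
      error = function(e) NULL
    )
    if (!is.null(fit)) aic_table[p + 1, q + 1] <- fit$aic
  }
}

print(round(aic_table, 3))

# Locate the smallest AIC and convert matrix indices back to orders.
best   <- which(aic_table == min(aic_table, na.rm = TRUE), arr.ind = TRUE)
best_p <- best[1, "row"] - 1
best_q <- best[1, "col"] - 1
cat("Lowest AIC at p =", best_p, "and q =", best_q, "\n")
```

Models whose AIC values differ by only a point or two are hard to tell apart on AIC alone, so the lowest cell need not be the best forecasting model.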