Converting annual data to quarterly data constrained to the annual average

I have several variables at annual frequency in R that I would like to include in a regression analysis with other variables available at quarterly frequency. In addition, I would like to be able to convert the quarterly data back to annual frequency and reproduce the original annual data.

My current approach when converting from low-frequency to high-frequency time series data is to use the na.spline function in the zoo package. However, I do not see how to constrain the quarterly data to the corresponding annual average. As a result, when I convert the data from quarterly back to annual frequency, I get annual values that differ from the original series.

Reproducible example:

library(zoo)

# create annual example series
a <- as.numeric(c("100", "110", "111"))
b <- as.Date(c("2000-01-01", "2001-01-01", "2002-01-01"))
z_a <- zoo(a, b); z_a

# current approach using na.spline in zoo package
end_z <- as.Date(as.yearqtr(end(z_a))+ 3/4)
z_q <- na.spline(z_a, xout = seq(start(z_a), end_z, by = "quarter"), method = "hyman")

# result, with first quarter equal to annual value
c <- merge(z_a, z_q); c

# convert back to annual using aggregate in zoo package 
# At this point I would want both series to be equal, but they aren't. 
d <- aggregate(c, as.integer(format(index(c),"%Y")), mean, na.rm=TRUE); d
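
To see why the aggregated means cannot match, here is a minimal base R sketch of the same idea without zoo. It assumes that stats::spline with method = "hyman" is a fair stand-in for na.spline (the variable names are illustrative, not part of the original example). The spline passes through the annual values in the first quarter, so the mean over all four quarters drifts away from them.

```r
# Base R sketch: interpolate annual values to quarters with a monotone
# (Hyman) spline, then average the quarters back to annual frequency.
years  <- c(2000, 2001, 2002)
annual <- c(100, 110, 111)

# Quarterly time axis; Q1 of each year coincides with the annual date
qtr_time <- seq(2000, 2002.75, by = 0.25)

quarterly <- spline(years, annual, xout = qtr_time, method = "hyman")$y
yr <- floor(qtr_time)

# The spline interpolates, so the Q1 values equal the annual values
first_quarter <- quarterly[qtr_time == yr]

# The yearly means include Q2-Q4 and therefore differ from the annual values
annual_mean <- tapply(quarterly, yr, mean)

first_quarter
annual_mean - annual
```

Any interpolating method has this property, which is why the constraint has to be imposed separately.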

Storing the original annual data is one solution, or I could convert back by taking the value of the first quarter as the annual value. But either approach adds complexity, because I would need to track which of my quarterly series were originally converted from annual data.

I would prefer a solution in zoo or xts packages, but alternative suggestions are also welcome.

Edit: here is Approach 1 from the answers, with a simple plot.

# Approach 1
yr <- format(time(c), "%Y")
c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) x - mean(x) + x[1]); c

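# simple plot
library(ggplot2)  # ggplot, geom_line, geom_point
library(tidyr)    # gather
library(magrittr) # %>%
dat <- c %>%
  data.frame(date = time(.), .) %>%
  gather(variable, value, -date)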
ggplot(data=dat, aes(x=date, y=value, group=variable, color=variable)) +
  geom_line() +
  geom_point() +
  theme(legend.position=c(.7, .4)) + 
  geom_point(data = subset(dat,variable == "z_a"),  colour="red", shape=1, size=7)

The plot shows that with Approach 1 the adjusted series is discontinuous between Q4 and Q1 (for example, at 2001Q1).

Answers:

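The tempdisagg package does this. It ensures that the sum, the average, the first, or the last value of the resulting quarterly series is consistent with the annual series.

It also supports external indicator series, for example with the Chow-Lin method. Without an indicator, the Denton-Cholette method can be used; compare the method in Eviews.

For your example: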

# need ts object as input
z_a <- ts(c(100, 110, 111), start = 2000)

library(tempdisagg)
z_q <- predict(td(z_a ~ 1, method = "denton-cholette", conversion = "average"))

z_q
#           Qtr1      Qtr2      Qtr3      Qtr4
# 2000  97.65795  98.59477 100.46841 103.27887
# 2001 107.02614 109.71460 111.34423 111.91503
# 2002 111.42702 111.06100 110.81699 110.69499

# which has the same means as your original series:

tapply(z_q, floor(time(z_q)), mean)
# 2000 2001 2002 
#  100  110  111 
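
As a cross-check in base R only, the sketch below re-enters the quarterly values printed above (copied by hand, so they are rounded) and averages them back with aggregate() on the ts object. The name z_a_back is illustrative.

```r
# Quarterly values copied from the printed output above (rounded to 5 digits)
z_q <- ts(c(97.65795,  98.59477, 100.46841, 103.27887,
            107.02614, 109.71460, 111.34423, 111.91503,
            111.42702, 111.06100, 110.81699, 110.69499),
          start = c(2000, 1), frequency = 4)

# aggregate() on a ts collapses each block of 4 quarters into one annual value
z_a_back <- aggregate(z_q, nfrequency = 1, FUN = mean)
z_a_back
```

Up to rounding, z_a_back should reproduce the original annual series 100, 110, 111.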

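With na.spline, the interpolated series passes through the annual values in the first quarter, so the mean of the 4 quarters generally differs from the annual value. One option is to shift all 4 quarters of each year so that their mean equals the annual value. Another is to keep the first quarter and shift the remaining 3 so that the mean of those 3 equals the annual value.

The adjusted series is stored in z_q_adj.

Code:

# Approach 1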
yr <- format(time(c), "%Y")
c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) x - mean(x) + x[1])

giving:

> c
           z_a      z_q   z_q_adj
2000-01-01 100 100.0000  95.36604
2000-04-01  NA 103.4434  98.80946
2000-07-01  NA 106.4080 101.77405
2000-10-01  NA 108.6844 104.05046
2001-01-01 110 110.0000 109.39295
2001-04-01  NA 110.5723 109.96527
2001-07-01  NA 110.8719 110.26484
2001-10-01  NA 110.9840 110.37694
2002-01-01 111 111.0000 110.86116
2002-04-01  NA 111.0150 110.87615
2002-07-01  NA 111.1219 110.98311
2002-10-01  NA 111.4184 111.27958


# Approach 2
c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) c(x[1], x[-1] - mean(x[-1]) + x[1]))

giving:

> c
           z_a      z_q  z_q_adj
2000-01-01 100 100.0000 100.0000
2000-04-01  NA 103.4434  97.2648
2000-07-01  NA 106.4080 100.2294
2000-10-01  NA 108.6844 102.5058
2001-01-01 110 110.0000 110.0000
2001-04-01  NA 110.5723 109.7629
2001-07-01  NA 110.8719 110.0625
2001-10-01  NA 110.9840 110.1746
2002-01-01 111 111.0000 111.0000
2002-04-01  NA 111.0150 110.8299
2002-07-01  NA 111.1219 110.9368
2002-10-01  NA 111.4184 111.2333
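
Both adjustments can be checked without zoo. The sketch below copies the z_q column from the tables above into a plain vector and applies the same two ave() calls; the names adj1, adj2, mean1 and mean2 are illustrative. In both cases the yearly means should come back to the annual values, and Approach 2 additionally keeps the first quarter unchanged.

```r
# z_q values copied from the tables above
z_q <- c(100.0000, 103.4434, 106.4080, 108.6844,
         110.0000, 110.5723, 110.8719, 110.9840,
         111.0000, 111.0150, 111.1219, 111.4184)
yr <- rep(2000:2002, each = 4)

# Approach 1: shift all four quarters so their mean equals the Q1 (annual) value
adj1 <- ave(z_q, yr, FUN = function(x) x - mean(x) + x[1])

# Approach 2: keep Q1, shift Q2-Q4 so that their mean equals the Q1 value
adj2 <- ave(z_q, yr, FUN = function(x) c(x[1], x[-1] - mean(x[-1]) + x[1]))

mean1 <- tapply(adj1, yr, mean)
mean2 <- tapply(adj2, yr, mean)

rbind(mean1, mean2)
```

Neither adjustment smooths the transition between years, so level shifts at year boundaries remain possible.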

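As for tracking which series were originally annual, some options are:

  • attach a comment to the object, e.g. comment(c) <- "Originally annual",

  • use a naming convention, e.g. an _a suffix for series that were originally annual: c_a <- c,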

  • , c_q c_q_adj, , ,

  • ,
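
The comment() idea from the list above could look roughly like this. It is a sketch with made-up series and hypothetical names (gdp, sales, was_annual), using base R ts objects rather than zoo. Note that attributes such as comments are not guaranteed to survive every transformation, so the tag may need to be reapplied.

```r
# Hypothetical quarterly series that was interpolated from annual data
z_q_from_annual <- ts(c(100, 103, 106, 109), start = c(2000, 1), frequency = 4)
comment(z_q_from_annual) <- "Originally annual"

# Hypothetical series that was quarterly from the start (no comment set)
z_q_native <- ts(c(50, 51, 52, 53), start = c(2000, 1), frequency = 4)

series <- list(gdp = z_q_from_annual, sales = z_q_native)

# comment() returns NULL when no comment is set, so identical() yields FALSE
was_annual <- vapply(series,
                     function(s) identical(comment(s), "Originally annual"),
                     logical(1))
was_annual
```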


Perhaps I am missing something, but why not simply replace mean with min in the aggregate call?

 > d <- aggregate(c, as.integer(format(index(c),"%Y")), min, na.rm=TRUE)
 > d
      z_a z_q
 2000 100 100
 2001 110 110
 2002 111 111
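
One caveat with min: it returns the first quarter only when the series does not fall below that value within the year, as happens here with a rising series. The sketch below uses made-up numbers and a hypothetical helper first_value to show the difference on a falling series.

```r
yr <- rep(2000:2001, each = 4)

# Made-up quarterly series: one rising within each year, one falling
rising  <- c(100, 103, 106, 109, 110, 110.5, 110.8, 111)
falling <- c(100,  98,  96,  94,  90,  89.5,  89.2,  89)

# Hypothetical helper: take the first quarter explicitly
first_value <- function(x) x[1]

min_rising    <- tapply(rising,  yr, min)          # equals the first quarter
min_falling   <- tapply(falling, yr, min)          # picks the last quarter instead
first_falling <- tapply(falling, yr, first_value)  # always the first quarter

rbind(min_rising, min_falling, first_falling)
```

Taking the first quarter explicitly does not depend on the direction of the series.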

Source: https://habr.com/ru/post/1608834/

