I have a data.frame that contains time series together with one or more grouping fields, so there are several time series, one for each combination of grouping values. Some dates are missing. What is the easiest way (ideally the "tidyverse way") to add these dates with the correct grouping values?
Normally I would create a data.frame with all dates and do a full_join with my time series. But now we have to do this for each combination of grouping values and fill in the grouping values as well.
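For a single, ungrouped time series the usual approach can be sketched as follows. The data and the names `series` and `series.dates` are made up for illustration only:

```r
library(dplyr)

# Hypothetical single time series in which 2017-01-03 is missing
series <- data.frame(
  date = as.Date(c("2017-01-01", "2017-01-02", "2017-01-04")),
  v1   = c(0.1, 0.2, 0.4)
)

# Every day between the first and the last observed date
series.dates <- data.frame(
  date = seq.Date(min(series$date), max(series$date), by = "day")
)

# full_join keeps all rows from both sides, so the missing day
# appears as a new row with v1 = NA
series.full <- full_join(series, series.dates, by = "date") %>%
  arrange(date)
```

With only one series there are no grouping columns to fill in, which is why this is straightforward.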
Let's look at an example:
First, I create a data.frame with missing values:
library(dplyr)
library(lubridate)
set.seed(1234)
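The part of the code that actually builds df.missing is not shown here. A minimal sketch of how such a data.frame could be generated is below. The date range, the levels of d1 and d2, the value columns v1 and v2, and the 0.8 cut-off are all assumptions for illustration, not necessarily what the original code uses:

```r
library(dplyr)

set.seed(1234)

# Assumed complete data: 10 days x 4 values of d1 x 5 values of d2
df <- expand.grid(
  date = seq.Date(as.Date("2017-01-01"), as.Date("2017-01-10"), by = "day"),
  d1   = c("A", "B", "C", "D"),
  d2   = 1:5,
  stringsAsFactors = FALSE
)
df$v1 <- runif(nrow(df))
df$v2 <- runif(nrow(df))

# Drop rows at random so that groups lose some of their dates,
# then group by the two dimensions
df.missing <- df %>%
  filter(v1 <= 0.8) %>%
  group_by(d1, d2)
```

Any data.frame with a date column and two grouping columns d1 and d2 would serve the same purpose.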
So, to add the missing dates, I create a data.frame with all dates:
start <- min(df.missing$date)
end <- max(df.missing$date)
all.dates <- data.frame(date = seq.Date(start, end, by = "day"))
But this alone is not enough. I want to do something like the following (remember: df.missing is grouped with group_by(d1, d2)):
df.missing %>% do(my_join(.))
So let's define my_join():
my_join <- function(data) {
Now we can call my_join() for each combination and look at the group "A / 5":
df.missing %>% do(my_join(.)) %>% filter(d1 == "A" & d2 == 5)
Excellent! This is what we were looking for. But we need to set d1 and d2 explicitly inside my_join, which feels a little awkward.
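Since the body of my_join is cut off above, here is a minimal sketch of what such a helper could look like, assuming it joins one group's rows against all.dates and then overwrites d1 and d2 with that group's values. The small data set and the names d1.value and d2.value are hypothetical:

```r
library(dplyr)

# Small hypothetical stand-in for df.missing, grouped by d1 and d2
df.missing <- data.frame(
  date = as.Date(c("2017-01-01", "2017-01-03", "2017-01-02", "2017-01-03")),
  d1   = c("A", "A", "B", "B"),
  d2   = c(5, 5, 1, 1),
  v1   = c(0.1, 0.3, 0.5, 0.6),
  stringsAsFactors = FALSE
) %>%
  group_by(d1, d2)

all.dates <- data.frame(
  date = seq.Date(min(df.missing$date), max(df.missing$date), by = "day")
)

# Assumed body: join one group against all dates, then fill in
# the grouping values, which would otherwise be NA for the new rows
my_join <- function(data) {
  d1.value <- data$d1[[1]]
  d2.value <- data$d2[[1]]
  data %>%
    ungroup() %>%
    full_join(all.dates, by = "date") %>%
    mutate(d1 = d1.value, d2 = d2.value) %>%
    arrange(date)
}

result <- df.missing %>%
  do(my_join(.))

result %>% filter(d1 == "A" & d2 == 5)
```

The two lines that read the first value of d1 and d2 and the mutate() that writes them back are the awkward part: the function has to know the names of the grouping columns.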
So, is there a more idiomatic tidyverse way to get to this result?
PS: I put the complete code in a gist: https://gist.github.com/JerryWho/1bf919ef73792569eb38f6462c6d7a8e