How can I apply grouped data to grouped models using broom and dplyr?

Question

I would like to fit a model of gpm (gallons per mile = 1 / mpg) against wt in the mtcars dataset, separately for each cyl/am group. This seems easy:

data(mtcars)
library(dplyr)
library(tidyr)
library(broom)
library(ggplot2)
library(scales)

mtcars2 <-
    mtcars %>%
    mutate(gpm = 1 / mpg) %>%
    group_by(cyl, am)

lm1 <-
    mtcars2 %>%
    do(fit = lm(gpm ~ wt, data = .))

This gives me a data frame with six rows, one per cyl/am group, as expected.

This graph confirms that there are six groups:

p1 <-
    qplot(wt, gpm, data = mtcars2) +
    facet_grid(cyl ~ am) +
    stat_smooth(method='lm',se=FALSE, fullrange = TRUE) +
    scale_x_continuous(limits = c(0,NA))

I can use augment() to get the fitted values:

lm1 %>% augment(fit)

This gives me 32 rows, one for each row in mtcars2, as expected.

Now the challenge: I want to get fitted values using newdata, where I have increased wt by cyl / 4:

newdata <-
    mtcars2 %>%
    mutate(
        wt = wt + cyl/4)

I expected this to produce a data frame of the same size as lm1 %>% augment(fit): one row for each row in newdata, because broom would match the models and newdata using the grouping variables cyl and am.

Instead,

pred1 <-
    lm1 %>%
    augment(
        fit,
        newdata = newdata)

gives me a data frame with 192 rows (= 6 x 32), apparently applying each of the six models to every row of newdata.

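As far as I can tell, group_by and rowwise data frames are not equivalent, so cyl and am are not used to match lm1 with newdata. Is there a way to do this?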

sessionInfo():

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] scales_0.4.0  ggplot2_2.1.0 broom_0.4.1   tidyr_0.6.0   dplyr_0.5.0  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7      magrittr_1.5     mnormt_1.5-4     munsell_0.4.3   
 [5] colorspace_1.2-6 lattice_0.20-34  R6_2.1.3         stringr_1.1.0   
 [9] plyr_1.8.4       tools_3.3.1      parallel_3.3.1   grid_3.3.1      
[13] nlme_3.1-128     gtable_0.2.0     psych_1.6.9      DBI_0.5-1       
[17] lazyeval_0.2.0   assertthat_0.1   tibble_1.2       reshape2_1.4.1  
[21] labeling_0.3     stringi_1.1.1    compiler_3.3.1   foreign_0.8-67

EDIT:

@aosmith: , . , , mutate: ": , ".

:

newdata %>% 
dplyr::select(cyl, am, wt) %>% # wt holds new predictor values
group_by(cyl, am) %>%
nest() %>%
inner_join(regressions, .) %>% 
## looks like yours at this point
mutate(pred = list(augment(fit, newdata = data))) %>% # Error here
unnest(pred)

, , , ( ): ID (chr), attr1 (dbl), cyl (dbl), am (chr), fit (list) (). , am (dbl), . am dbl, .

, , 3 (ID..., mtcars) x 2 (cyl) x 2 (am) ( 12 ), mtcars 3 () x 2 (am) xa . ID, newdata . , , . , ?

EDIT: newdata ( full = TRUE) . .

r dplyr broom

BillH 03 . '16 17:13

aosmith · Accepted Answer · 2016-10-03T18:29:19+0000

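I have used map2 from purrr for this kind of situation. map2 loops through the elements of two lists simultaneously. The lists must be the same length and in the same order.

The elements of the lists are passed as arguments to the function you want to apply (augment, in your case). Here the two lists are the list of models and a list of datasets (one for each cyl/am combination).

Using map2_df returns the results as a data.frame instead of a list.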

library(purrr)

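I made the list of data.frames to predict on with split. The order of the factors to split on matters, because the list has to be in the same order as lm1.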

test_split = split(newdata, list(newdata$am, newdata$cyl))

map2_df(lm1$fit, test_split, ~augment(.x, newdata = .y))
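The ordering requirement can be checked before trusting the result. The following is only a sketch in base R: the names newdata_check, split_check and expected are made up for this illustration, and it assumes the models in lm1 are ordered by cyl and then am, which is how group_by(cyl, am) sorts its groups. split() with list(am, cyl) varies the first factor fastest, so its names should line up with that order.

```r
# Rebuild the prediction data with base R only (hypothetical names).
newdata_check <- transform(mtcars, gpm = 1 / mpg, wt = wt + cyl / 4)

# Split with am listed first, as in the answer; the first factor varies fastest.
split_check <- split(newdata_check, list(newdata_check$am, newdata_check$cyl))

# Group labels in the order group_by(cyl, am) would produce: cyl first, then am.
keys <- unique(newdata_check[order(newdata_check$cyl, newdata_check$am),
                             c("cyl", "am")])
expected <- paste(keys$am, keys$cyl, sep = ".")

# TRUE means the list is in the same order as the grouped models.
identical(names(split_check), expected)
```

If the factors were given as list(cyl, am) instead, the names would come out in a different order and the models would be paired with the wrong data without any error being raised.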

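Alternatively, following your second idea, you can nest the prediction data by group, join it to lm1, and then call augment on each row.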

newdata %>%
    group_by(cyl, am) %>%
    nest() %>%
    inner_join(lm1, .) %>%
    mutate(pred = list(augment(fit, newdata = data))) %>%
    unnest(pred)
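For comparison, here is a sketch of the same join-then-augment idea with the models kept in a list-column and map2 doing the pairing, so the match is by cyl and am rather than by position. It assumes reasonably recent versions of dplyr, tidyr, purrr and broom, and the names models, new_nested and pred2 are invented for this example rather than taken from the question.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

mtcars2 <- mtcars %>% mutate(gpm = 1 / mpg)

# One model per cyl/am group, stored in a list-column.
models <- mtcars2 %>%
    group_by(cyl, am) %>%
    nest() %>%
    ungroup() %>%
    mutate(fit = map(data, ~ lm(gpm ~ wt, data = .x))) %>%
    select(cyl, am, fit)

# Prediction data nested by the same grouping variables.
new_nested <- mtcars2 %>%
    mutate(wt = wt + cyl / 4) %>%
    group_by(cyl, am) %>%
    nest() %>%
    ungroup() %>%
    rename(new = data)

# Join on the group keys, then apply each model to its own group's data.
pred2 <- models %>%
    inner_join(new_nested, by = c("cyl", "am")) %>%
    mutate(pred = map2(fit, new, ~ augment(.x, newdata = .y))) %>%
    select(cyl, am, pred) %>%
    unnest(pred)

nrow(pred2)  # one row per row of the prediction data
```

Because the join is explicit, each model only ever sees the rows from its own group, which is what avoids the 6 x 32 blow-up described in the question.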
