Predictions based on level data (with group models)

Question

I desperately need help. I use dplyr to run grouped regressions, something like this:

library(dplyr)

regressions <- mtcars %>%
  group_by(cyl) %>%
  do(fit = lm(wt ~ mpg + qsec + gear, .))

and I get the models in a data frame that looks like this:

  ##     cyl     fit
  ##   (dbl)   (chr)
  ## 1     4 <S3:lm>
  ## 2     6 <S3:lm>
  ## 3     8 <S3:lm>

Now I want to predict on new data that has fewer rows than the training data but contains the same levels, i.e. 4, 6 and 8 for cyl. My question is: how can I predict on the new/test data so that each model uses only the rows of its own level in the test set?

So:
the cyl 4 model uses only the cyl 4 test data to predict,
the cyl 6 model uses only the cyl 6 test data to predict,
the cyl 8 model uses only the cyl 8 test data to predict,
and so on.

Please keep in mind that the test data contains all levels / groups.

+4

r

Alice Work Aug 29 '16 at 16:37

3

With the lm objects stored in a data.frame, a loop like this should do the trick:

A <- list()
for (i in unique(mtcars$cyl)) {
  A[[as.character(i)]] <- predict(as.list(regressions[regressions$cyl == i, ])$fit[[1]],
                    newdata = mtcars[mtcars$cyl == i, ])
}

You could also fit and predict in a single loop, without do:

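library(dplyr)  # for filter()

reg <- list()
pred <- list()
for (k in unique(mtcars$cyl)) {
  # the loop variable must not be named cyl: inside filter(), cyl == cyl
  # compares the column with itself and keeps every row
  grp <- filter(mtcars, cyl == k)
  reg[[as.character(k)]] <- lm(wt ~ mpg + qsec + gear, grp)
  pred[[as.character(k)]] <- predict(reg[[as.character(k)]], newdata = grp)
}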

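You could also use lapply over unique(mtcars$cyl). The as.character call is there so that the list elements are named by level; with a bare number, the list would be indexed by position instead, leaving empty elements.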
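The lapply variant mentioned above can be sketched as follows. This is a minimal base R sketch, not the answerer's own code; `mtcars_test`, `levels_cyl`, `fits` and `preds` are names made up for illustration, and the test set is simply every second row of mtcars so that it is shorter than the training data.

```r
# Hypothetical shorter test set: every second row of mtcars
mtcars_test <- mtcars[seq(1, nrow(mtcars), by = 2), ]

levels_cyl <- sort(unique(mtcars$cyl))

# One model per cyl level, stored in a list named by level
fits <- lapply(levels_cyl, function(k) {
  lm(wt ~ mpg + qsec + gear, data = mtcars[mtcars$cyl == k, ])
})
names(fits) <- as.character(levels_cyl)

# Predict each test subset with the model fitted on the same level
preds <- lapply(names(fits), function(k) {
  predict(fits[[k]], newdata = mtcars_test[mtcars_test$cyl == as.numeric(k), ])
})
names(preds) <- names(fits)

# Number of test predictions per level
lengths(preds)
```

Because both lists are keyed by the level name, a model is never applied to another group's rows, whatever order the levels come in.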

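Alternatively, by using * to interact every term with cyl, you can fit all groups in a single model. For this to work, cyl must be a factor.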

mtcars$cyl <- factor(mtcars$cyl)
reg <- lm(wt ~ (1 + mpg + qsec + gear)*cyl, mtcars)
predict(reg, mtcars)

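Note that the coefficients are then expressed relative to the reference level (e.g. the mpg slope for cyl = 6 is the sum of the mpg and mpg:cyl6 coefficients).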
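To make the coefficient bookkeeping concrete, here is a small check, assuming a copy of mtcars called `d` (a hypothetical name) so that the original data frame is left untouched. It compares the interaction model with a separate regression for cyl = 6.

```r
# Work on a copy so mtcars itself is not modified
d <- mtcars
d$cyl <- factor(d$cyl)

# Single model with every term interacted with cyl
pooled <- lm(wt ~ (1 + mpg + qsec + gear) * cyl, data = d)

# Separate model for the cyl == 6 group only
separate6 <- lm(wt ~ mpg + qsec + gear, data = subset(d, cyl == "6"))

# mpg slope for cyl 6 = baseline mpg slope + mpg:cyl6 adjustment
b <- coef(pooled)
slope6 <- b[["mpg"]] + b[["mpg:cyl6"]]
all.equal(slope6, coef(separate6)[["mpg"]])

# Predictions for the cyl 6 rows agree as well
all.equal(unname(predict(pooled, newdata = subset(d, cyl == "6"))),
          unname(fitted(separate6)))
```

The point predictions match the per-group fits, but the single model pools the residual variance across groups, so standard errors and prediction intervals will generally differ from those of the separate models.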

0

Choubi Aug 29 '16 at 17:09

You can use broom::augment for this.

For example:

library(broom)
library(dplyr)

# fit the set of regressions by cyl
regressions = mtcars %>% group_by(cyl) %>%
  do(fit = lm(wt ~ mpg + qsec + gear, .))

# score the regressions by cyl
scores = regressions %>% 
  augment(fit)

You can verify that these results match those of fitting and evaluating a separate regression for each group defined by the values of cyl.

# check that regression with cyl == 4 and predictions gives the same result
lm_4 = lm(wt ~ mpg + qsec + gear, data = subset(mtcars, cyl == 4))
predict(lm_4, newdata = subset(mtcars, cyl == 4))
scores %>% 
  filter(cyl == 4)

# check that regression with cyl == 8 and predictions gives the same result
lm_8 = lm(wt ~ mpg + qsec + gear, data = subset(mtcars, cyl == 8))
predict(lm_8, newdata = subset(mtcars, cyl == 8))
scores %>% 
  filter(cyl == 8)
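The snippets above score the same rows the models were fitted on. For separate test data, augment for lm objects also takes a newdata argument. The following is a minimal sketch rather than part of the original answer: `mtcars_test`, `train_split`, `test_split`, `fits` and `scored` are hypothetical names, and the test set is just every second row of mtcars.

```r
library(broom)

# Hypothetical shorter test set: every second row of mtcars
mtcars_test <- mtcars[seq(1, nrow(mtcars), by = 2), ]

# Split training and test data by cyl; both lists are named "4", "6", "8"
train_split <- split(mtcars, mtcars$cyl)
test_split <- split(mtcars_test, mtcars_test$cyl)

# One model per group
fits <- lapply(train_split, function(d) lm(wt ~ mpg + qsec + gear, data = d))

# Score each test subset with the model of the same group (matched by name)
scored <- lapply(names(fits), function(k) {
  augment(fits[[k]], newdata = test_split[[k]])
})
names(scored) <- names(fits)

# Stack the per-group results; predictions are in the .fitted column
scored_all <- dplyr::bind_rows(scored)
```

Matching by name rather than by position keeps each model tied to its own level even if the two lists were built in different orders.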

0

tchakravarty Aug 29 '16 at 18:53

aosmith · Accepted Answer · 2016-08-29T17:25:55+0000

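Here is an option using purrr along with dplyr and tidyr. purrr has map functions that loop through lists, such as the list column of models returned by do.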

First, for illustration, I'll make a test dataset called mtcars_test.

mtcars_test = mtcars

Split the test dataset by cyl.

test_split = split(mtcars_test, mtcars_test$cyl)

Then use map2 to loop through the models and the test datasets in parallel, predicting from each pair.

library(purrr)

map2(regressions$fit, test_split, predict)

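This returns a list with one element of predictions per group. You could also fit the models with purrr inside mutate plus tidyr::nest, like so: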

library(tidyr)

regs = mtcars %>%
    group_by(cyl) %>%
    nest() %>%
    mutate(fit = map(data, ~lm(wt ~ mpg + qsec + gear, .)))

Then use map2 as before, this time inside mutate.

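# map2 pairs elements by position: the rows of regs and the elements of
# test_split must be in the same group order
regs %>% 
    mutate(testpred = map2(fit, test_split, predict))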

To get the predictions out of the list column, use tidyr::unnest.

regs %>% 
    mutate(testpred = map2(fit, test_split, predict)) %>%
    unnest(testpred)

# A tibble: 32 × 2
     cyl testpred
   <dbl>    <dbl>
1      6 3.607719
2      6 4.263550
3      6 5.418092
4      6 4.386157
5      6 3.898692
6      6 4.632542
...
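One thing to watch: map2 pairs its inputs by position, and split orders the test subsets by sorted level (4, 6, 8), whereas the nested data frame may list the groups in order of appearance (the output above starts with cyl 6). If the two orders differ, a model can end up predicting on another group's rows. A hedged sketch that matches by the cyl key instead, assuming tidyr 1.0 or later for the `nest(data = -cyl)` syntax and a hypothetical shorter test set (`regs_by_key`, `test_nested` and `preds_by_key` are made-up names):

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical shorter test set: every second row of mtcars
mtcars_test <- mtcars[seq(1, nrow(mtcars), by = 2), ]

# One row per cyl with the training data and the fitted model
regs_by_key <- mtcars %>%
  nest(data = -cyl) %>%
  mutate(fit = map(data, ~ lm(wt ~ mpg + qsec + gear, data = .x)))

# One row per cyl with the matching test data
test_nested <- mtcars_test %>%
  nest(test = -cyl)

# Join on cyl so every model meets its own test rows, then predict
preds_by_key <- regs_by_key %>%
  inner_join(test_nested, by = "cyl") %>%
  mutate(testpred = map2(fit, test, ~ unname(predict(.x, newdata = .y)))) %>%
  select(cyl, testpred) %>%
  unnest(testpred)
```

The join makes the pairing explicit, so the result does not depend on the row order of either table, and a level missing from the test data simply drops out.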
