Apply lm to each subset of a data frame defined by a third column

I have a data frame containing a vector of x values, a vector of y values, and an identifier vector:

    x <- rep(0:3, 3)
    y <- runif(12)
    ID <- c(rep("a", 4), rep("b", 4), rep("c", 4))
    df <- data.frame(ID=ID, x=x, y=y)

I would like to fit a separate lm for each subset of x and y that shares the same identifier. The following code does the job:

    a.lm <- lm(x~y, data=subset(df, ID=="a"))
    b.lm <- lm(x~y, data=subset(df, ID=="b"))
    c.lm <- lm(x~y, data=subset(df, ID=="c"))

The trouble is that this is very fragile (future data sets may contain different identifiers) and not vectorized. I would also like to keep all the lms in a single data structure. There must be an elegant way to do this, but I cannot find it. Any help?

3 answers

What about

    library(nlme)  ## OR library(lme4)
    lmList(x~y|ID, data=df)

?
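
As a minimal sketch of how the fitted list might then be used, the snippet below rebuilds the question's data (the seed is an arbitrary assumption, and ID is made an explicit factor to be safe) and pulls the per-group coefficients out of the lmList result. It assumes the nlme version of lmList; the names fits, coefs, and fit_a are illustrative only.

```r
library(nlme)  # a recommended package shipped with standard R installations

set.seed(1)  # arbitrary seed so the simulated y values are reproducible
df <- data.frame(
  ID = factor(rep(c("a", "b", "c"), each = 4)),
  x  = rep(0:3, 3),
  y  = runif(12)
)

# One linear model per level of ID, kept together in a single lmList object
fits <- lmList(x ~ y | ID, data = df)

coefs <- coef(fits)   # data frame: one row per ID, one column per coefficient
fit_a <- fits[["a"]]  # the ordinary lm object fitted to the ID == "a" rows
```

Because the grouping comes from the formula, new identifiers in future data sets are picked up without changing the code.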


Using base functions, you can split your original data frame and use lapply to do this:

    lapply(split(df,df$ID),function(d) lm(x~y,d))
    $a

    Call:
    lm(formula = x ~ y, data = d)

    Coefficients:
    (Intercept)            y  
        -0.2334       2.8813  


    $b

    Call:
    lm(formula = x ~ y, data = d)

    Coefficients:
    (Intercept)            y  
         0.7558       1.8279  


    $c

    Call:
    lm(formula = x ~ y, data = d)

    Coefficients:
    (Intercept)            y  
          3.451       -7.628  

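
Since the result is a plain named list, the usual apply-style helpers work on it. The following is a rough sketch, not part of the original answer, that gathers the coefficients of every model into one matrix; the seed and the names fits and coef_table are assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(
  ID = rep(c("a", "b", "c"), each = 4),
  x  = rep(0:3, 3),
  y  = runif(12)
)

# One lm per ID, stored in a list named after the identifiers
fits <- lapply(split(df, df$ID), function(d) lm(x ~ y, data = d))

# sapply() puts each model's coefficients in a column; t() gives one row per ID
coef_table <- t(sapply(fits, coef))
```

Here coef_table has one row per identifier and the columns (Intercept) and y, which is often handier than three separate printouts.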

Use the magic in the plyr package. The dlply function takes a data.frame, splits it, applies a function to each piece, and combines the results into a list. This is perfect for your application.

    library(plyr)
    #fitList <- dlply(df, .(ID), function(dat)lm(x~y, data=dat))
    fitList <- dlply(df, .(ID), lm, formula=x~y) # Edit

This creates a list with one model per ID subset:

    str(fitList, max.level=1)
    List of 3
     $ a:List of 12
      ..- attr(*, "class")= chr "lm"
     $ b:List of 12
      ..- attr(*, "class")= chr "lm"
     $ c:List of 12
      ..- attr(*, "class")= chr "lm"
     - attr(*, "split_type")= chr "data.frame"
     - attr(*, "split_labels")='data.frame': 3 obs. of 1 variable:

This means that you can subset the list and work with each element. For example, to get the coefficients of the lm model where ID=="a":

    > coef(fitList$a)
    (Intercept)           y 
       3.071854   -3.440928 
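
The same idea extends to every model at once. As a hedged sketch: the list below is rebuilt with base split and lapply as a stand-in for the dlply result so that it runs without plyr, and the seed, new_obs, r2, and pred are illustrative assumptions. The vapply calls should apply unchanged to a list returned by dlply, since that is also a named list of lm objects.

```r
set.seed(7)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(
  ID = rep(c("a", "b", "c"), each = 4),
  x  = rep(0:3, 3),
  y  = runif(12)
)

# Stand-in for the dlply() result: a named list of lm objects built with base R
fitList <- lapply(split(df, df$ID), function(d) lm(x ~ y, data = d))

# R-squared of each model, returned as a numeric vector named by ID
r2 <- vapply(fitList, function(m) summary(m)$r.squared, numeric(1))

# Predicted x at a hypothetical new y value, one prediction per ID
new_obs <- data.frame(y = 0.5)
pred <- vapply(
  fitList,
  function(m) unname(predict(m, newdata = new_obs)),
  numeric(1)
)
```

vapply is used rather than sapply so that the result is guaranteed to be one number per model.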

Source: https://habr.com/ru/post/897309/

