How can I find out which group failed when using group_by in a dplyr pipe chain? Take this example:
library(dplyr)
data(iris)
iris %>%
  group_by(Species) %>%
  do(mod = lm(Petal.Length ~ Petal.Width, data = .)) %>%
  mutate(Slope = summary(mod)$coeff[2])
This works fine. Now suppose I add some problem data to iris:
iris$Petal.Width[iris$Species == "versicolor"] <- NA
As a result, fitting a linear model for that species fails:
iris_sub <- iris[iris$Species == "versicolor", ]
lm(Petal.Length ~ Petal.Width, data = iris_sub)
But suppose I approached a massive dataset blind and ran:
iris %>%
  group_by(Species) %>%
  do(mod = lm(Petal.Length ~ Petal.Width, data = .)) %>%
  mutate(Slope = summary(mod)$coeff[2])
The resulting error message does not tell me which level of Species the model failed for:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases
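For context, here is a minimal base R sketch of how one might check in advance which levels of Species have no usable rows for this model. It assumes the modified iris data from above; the names ok and usable are made up for illustration. It only detects this particular cause of failure (all rows incomplete), not lm errors in general.

```r
# Recreate the problem data from the question
data(iris)
iris$Petal.Width[iris$Species == "versicolor"] <- NA

# TRUE for rows where both model variables are present
ok <- complete.cases(iris[, c("Petal.Length", "Petal.Width")])

# Number of usable rows per level of Species
usable <- tapply(ok, iris$Species, sum)
print(usable)

# Levels with no usable rows, where lm() would stop with "0 (non-NA) cases"
print(names(usable)[usable == 0])
```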
I could use a loop like the one below. This at least lets me see at which level of Species the function fails. However, I would prefer to stay within the dplyr approach:
lmdf <- c()
for (i in unique(iris$Species)) {
  cat(i, "\n")
  u <- iris %>%
    filter(Species == i) %>%
    do(mod = lm(Petal.Length ~ Petal.Width, data = .))
  lmdf <- rbind(lmdf, u)
}
Is there a way to do this within dplyr?
I tried tryCatch, but it fails with:
tryCatch(lm(v3 ~ v4, df), error = if (e$message == all_na_msg) default else stop(e)): object 'e' not found
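The "object 'e' not found" error arises because the error argument of tryCatch must be a function that takes the condition, not a bare expression that refers to e. Below is a minimal sketch of a corrected handler, wrapped in a hypothetical helper called safe_lm (not part of dplyr or base R) that reports the failing level and returns NULL for it. It is demonstrated with base R's split and lapply so that it runs without extra packages.

```r
# Recreate the problem data from the question
data(iris)
iris$Petal.Width[iris$Species == "versicolor"] <- NA

# safe_lm is a hypothetical helper for illustration only.
# The handler is a function of the condition object `e`.
safe_lm <- function(df) {
  tryCatch(
    lm(Petal.Length ~ Petal.Width, data = df),
    error = function(e) {
      message("lm failed for Species = ", as.character(df$Species[1]),
              ": ", conditionMessage(e))
      NULL  # placeholder returned for the failed group
    }
  )
}

# Fit one model per level of Species; failed groups come back as NULL
fits <- lapply(split(iris, iris$Species), safe_lm)

# Names of the groups whose model could not be fitted
failed <- names(fits)[vapply(fits, is.null, logical(1))]
print(failed)
```

The same helper should also work inside the pipeline, for example do(mod = safe_lm(.)) after group_by(Species), although a later step such as the Slope calculation would then need to handle the NULL entries.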