Multilevel regression model for multiplied data in R (Amelia, zelig, lme4)

I am trying to run a layered model for multiple imputed data (created using Amelia); the sample is based on a cluster sample with a group = 24, N = 150.

library("ZeligMultilevel") ML.model.0 <- zelig(dv~1 + tag(1|group), model="ls.mixed", data=a.out$imputations) summary(ML.model.0) 

This code produces the following error code:

 Error in object[[1]]$result$call : $ operator not defined for this S4 class 

If I run OLS regression, it works:

 model.0 <- zelig(dv~1, model="ls", data=a.out$imputations) m.0 <- coef(summary(model.0)) print(m.0, digits = 2) Value Std. Error t-stat p-value [1,] 45 0.34 130 2.6e-285 

I am pleased to present a working example. .

 require(Zelig) require(Amelia) require(ZeligMultilevel) data(freetrade) length(freetrade$country) #grouping variable #Imputation of missing data a.out <- amelia(freetrade, m=5, ts="year", cs="country") # Models: (1) OLS; (2) multi-level model.0 <- zelig(polity~1, model="ls", data=a.out$imputations) m.0 <- coef(summary(model.0)) print(m.0, digits = 2) ML.model.0 <- zelig(polity~1 + tag(1|country), model="ls.mixed", data=a.out$imputations) summary(ML.model.0) 

I think the problem is with the way Zelig interacts with the Amelia mi class. So I turned to the alternative package R: lme4.

 require(lme4) write.amelia(obj=a.out, file.stem="inmi", format="csv", na="NA") diff <-list(5) # a list to store each model, 5 is the number of the imputed datasets for (i in 1:5) { file.name <- paste("inmi", 5 ,".csv",sep="") data.to.use <- read.csv(file.name) diff[[5]] <- lmer(polity ~ 1 + (1 | country), data = data.to.use)} diff 

The result is the following:

 [[1]] [1] 5 [[2]] NULL [[3]] NULL [[4]] NULL [[5]] Linear mixed model fit by REML Formula: polity ~ 1 + (1 | country) Data: data.to.use AIC BIC logLik deviance REMLdev 1006 1015 -499.9 1002 999.9 Random effects: Groups Name Variance Std.Dev. country (Intercept) 14.609 3.8222 Residual 17.839 4.2236 Number of obs: 171, groups: country, 9 Fixed effects: Estimate Std. Error t value (Intercept) 2.878 1.314 2.19 

The results remain unchanged when I replace diff[[5]] with diff[[4]] , diff[[3]] , etc. However, I am wondering if these are true results for a combined dataset or for a single set of imputed data. Any thoughts? Thank you

+6
source share
1 answer

I changed the final function for this object (extracted the source and opened the file. /R/summary.R). I added some curly braces to make the code flow and changed getcoef to coef . This should work in this particular case, but I'm not sure if it is generic. The getcoef function searches for the coef3 slot, and I have never seen it. Perhaps @BenBolker can cast an eye here? I can not guarantee that this is a result similar to the result, but the result looks legit for me. Perhaps you can contact the authors of the package to fix this in a future version.

Summary (ML.model.0)

  Model: ls.mixed Number of multiply imputed data sets: 5 Combined results: Call: zelig(formula = polity ~ 1 + tag(1 | country), model = "ls.mixed", data = a.out$imputations) Coefficients: Value Std. Error t-stat p-value [1,] 2.902863 1.311427 2.213515 0.02686218 For combined results from datasets i to j, use summary(x, subset = i:j). For separate results, use print(summary(x), subset = i:j). 

Modified Function:

 summary.MI <- function (object, subset = NULL, ...) { if (length(object) == 0) { stop('Invalid input for "subset"') } else { if (length(object) == 1) { return(summary(object[[1]])) } } # Roman: This function isn't fecthing coefficients robustly. Something goes wrong. Contact package author. getcoef <- function(obj) { # S4 if (!isS4(obj)) { coef(obj) } else { if ("coef3" %in% slotNames(obj)) { obj@coef3 } else { obj@coef } } } # res <- list() # Get indices subset <- if (is.null(subset)) { 1:length(object) } else { c(subset) } # Compute the summary of all objects for (k in subset) { res[[k]] <- summary(object[[k]]) } # Answer ans <- list( zelig = object[[1]]$name, call = object[[1]] $result@call , all = res ) # coef1 <- se1 <- NULL # for (k in subset) { # tmp <- getcoef(res[[k]]) # Roman: I changed this to coef, not 100% sure if the output is the same tmp <- coef(res[[k]]) coef1 <- cbind(coef1, tmp[, 1]) se1 <- cbind(se1, tmp[, 2]) } rows <- nrow(coef1) Q <- apply(coef1, 1, mean) U <- apply(se1^2, 1, mean) B <- apply((coef1-Q)^2, 1, sum)/(length(subset)-1) var <- U+(1+1/length(subset))*B nu <- (length(subset)-1)*(1+U/((1+1/length(subset))*B))^2 coef.table <- matrix(NA, nrow = rows, ncol = 4) dimnames(coef.table) <- list(rownames(coef1), c("Value", "Std. Error", "t-stat", "p-value")) coef.table[,1] <- Q coef.table[,2] <- sqrt(var) coef.table[,3] <- Q/sqrt(var) coef.table[,4] <- pt(abs(Q/sqrt(var)), df=nu, lower.tail=F)*2 ans$coefficients <- coef.table ans$cov.scaled <- ans$cov.unscaled <- NULL for (i in 1:length(ans)) { if (is.numeric(ans[[i]]) && !names(ans)[i] %in% c("coefficients")) { tmp <- NULL for (j in subset) { r <- res[[j]] tmp <- cbind(tmp, r[[pmatch(names(ans)[i], names(res[[j]]))]]) } ans[[i]] <- apply(tmp, 1, mean) } } class(ans) <- "summaryMI" ans } 
+6
source

Source: https://habr.com/ru/post/945091/


All Articles