Chi-square analysis using a for loop in R

I am trying to run a chi-square test for all combinations of variables in the data, and my code is:

    Data <- esoph[ , 1:3]
    OldStatistic <- NA
    for(i in 1:(ncol(Data)-1)){
      for(j in (i+1):ncol(Data)){
        Statistic <- data.frame("Row"=colnames(Data)[i],
                                "Column"=colnames(Data)[j],
                                "Chi.Square"=round(chisq.test(Data[ ,i], Data[ ,j])$statistic, 3),
                                "df"=chisq.test(Data[ ,i], Data[ ,j])$parameter,
                                "p.value"=round(chisq.test(Data[ ,i], Data[ ,j])$p.value, 3),
                                row.names=NULL)
        temp <- rbind(OldStatistic, Statistic)
        OldStatistic <- Statistic
        Statistic <- temp
      }
    }

    str(Data)
    'data.frame': 88 obs. of 3 variables:
     $ agegp: Ord.factor w/ 6 levels "25-34"<"35-44"<..: 1 1 1 1 1 1 1 1 1 1 ...
     $ alcgp: Ord.factor w/ 4 levels "0-39g/day"<"40-79"<..: 1 1 1 1 2 2 2 2 3 3 ...
     $ tobgp: Ord.factor w/ 4 levels "0-9g/day"<"10-19"<..: 1 2 3 4 1 2 3 4 1 2 ...

    Statistic
        Row Column Chi.Square df p.value
    1 agegp  tobgp      2.400 15       1
    2 alcgp  tobgp      0.619  9       1

My code gives the chi-square results for variable 1 vs variable 3 and for variable 2 vs variable 3, but the result for variable 1 vs variable 2 is missing. I tried, but could not fix the code. Any comments and suggestions would be greatly appreciated. I would like to make a crosstab for all possible combinations. Thanks in advance.

EDIT

I used to run this analysis in SPSS, but now I want to switch to R.

2 answers

A sample of your data would be appreciated, but I think this will work for you. First, create all pairwise combinations of the columns with combn. Then write a function to be used with an apply-style function that iterates over the combinations. I like to use plyr because it makes it easy to specify the data structure you want on the back end. Also note that you only need to compute the chi-square test once for each pair of columns, which should speed things up as well.

    library(plyr)

    # Dat is the data frame of categorical columns
    combos <- combn(ncol(Dat), 2)   # one pair of column indices per matrix column

    adply(combos, 2, function(x) {
      # run the test once per pair and reuse the result
      test <- chisq.test(Dat[, x[1]], Dat[, x[2]])

      out <- data.frame("Row" = colnames(Dat)[x[1]]
                        , "Column" = colnames(Dat)[x[2]]
                        , "Chi.Square" = round(test$statistic, 3)
                        , "df" = test$parameter
                        , "p.value" = round(test$p.value, 3)
                        )
      return(out)
    })
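For readers who prefer not to depend on plyr, the same idea can be sketched in base R. This is only an illustration under the assumption that the data are the question's `esoph[, 1:3]`; the names `rows` and `Statistic` are chosen here for the example, and the small-expected-count warning from `chisq.test` is silenced just to keep the output tidy.

```r
# Assumption: the three ordered-factor columns from the question
Data <- esoph[, 1:3]

# All unordered pairs of column indices, one pair per column of the matrix
combos <- combn(ncol(Data), 2)

# Run the test once per pair and collect one-row data frames in a list
rows <- lapply(seq_len(ncol(combos)), function(k) {
  i <- combos[1, k]
  j <- combos[2, k]
  # Warnings about small expected counts are silenced for this sketch only
  test <- suppressWarnings(chisq.test(Data[, i], Data[, j]))
  data.frame(Row        = colnames(Data)[i],
             Column     = colnames(Data)[j],
             Chi.Square = round(unname(test$statistic), 3),
             df         = unname(test$parameter),
             p.value    = round(test$p.value, 3))
})

# Bind all rows at once instead of growing the result inside the loop
Statistic <- do.call(rbind, rows)
print(Statistic)
```

Collecting the rows in a list and binding them once also sidesteps the problem in the original loop: there, `OldStatistic <- Statistic` stores only the row from the current iteration, so each `rbind` keeps just the previous pair and the new one, and the first pair is lost by the time the loop ends.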

I wrote my own function. It creates a matrix in which all nominal variables are tested against each other. It can also save the results as an Excel file. It shows only the p-values that are below 5%.

    funMassChi <- function (x, delFirst=0, xlsxpath=FALSE) {
      options(scipen = 999)              # avoid scientific notation in the p-values
      start <- (delFirst+1)
      ds <- x[,start:ncol(x)]            # drop the first delFirst columns
      cATeND <- ncol(ds)
      catID <- 1:cATeND
      resMat <- ds[1:cATeND,1:(cATeND-1)]
      resMat[,] <- NA
      for(nCc in 1:(length(catID)-1)){
        for(nDc in (nCc+1):length(catID)){
          tryCatch({
            chiRes <- chisq.test(ds[,catID[nCc]],ds[,catID[nDc]])
            resMat[nDc,nCc] <- chiRes[[3]]   # third element of the result is the p-value
          }, error=function(e){cat(paste("ERROR :","at",nCc,nDc, sep=" "),conditionMessage(e), "\n")})
        }
      }
      resMat[resMat > 0.05] <- ""        # blank out non-significant p-values
      Ergebnis <- cbind(CatNames=names(ds),resMat)
      Ergebnis <<- Ergebnis[-1,]         # result is assigned to the global variable Ergebnis
      if (!(xlsxpath==FALSE)) {
        # write.xlsx() is not base R; a package that provides it (for example xlsx) must be loaded first
        write.xlsx(x = Ergebnis,
                   file = paste(xlsxpath,"ALLChi-",Sys.Date(),".xlsx",sep=""),
                   sheetName = "Tabelle1", row.names = FALSE)
      }
    }

    funMassChi(categorialDATA,delFirst=3,xlsxpath="C:/folder1/folder2/")

delFirst drops the first n columns, which is useful if you have a count index or some other column that you do not want to test.
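The core of the matrix idea can also be sketched without the Excel export or the global assignment. This is a minimal illustration, not a replacement for the function above: it assumes the question's `esoph[, 1:3]` as stand-in data, and the names `pmat` and `shown` are made up for the example. The p-values stay in a numeric matrix, and a separate character copy is built only for display.

```r
# Assumption: the question's three columns stand in for the categorical data
ds <- esoph[, 1:3]
n <- ncol(ds)

# Numeric matrix of p-values; only the lower triangle gets filled
pmat <- matrix(NA_real_, nrow = n, ncol = n,
               dimnames = list(names(ds), names(ds)))

for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    # Warnings about small expected counts are silenced for this sketch only
    pmat[j, i] <- suppressWarnings(chisq.test(ds[, i], ds[, j]))$p.value
  }
}

# Character copy for display: keep p-values below 0.05, blank out the rest
shown <- ifelse(!is.na(pmat) & pmat < 0.05, format(round(pmat, 3)), "")
print(noquote(shown))
```

Keeping `pmat` numeric means it can still be sorted, filtered, or corrected for multiple testing afterwards, while `shown` mirrors the "only p-values under 5%" display described above.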

I hope this helps someone.


Source: https://habr.com/ru/post/897119/

