R: t-test for all columns

I tried to perform a t-test on all columns (two at a time) of my data frame and extract only the p-value. Here is what I came up with:

    for (i in c(5:525)) {
      t_test_p.value = sapply(Data[5:525], function(x) t.test(Data[, i], x, na.rm = TRUE)$p.value)
    }

My questions: 1. Is there a way to do this without a loop? 2. How do I get the t-test results out?

+4
5 answers

Try this:

    X <- rnorm(n = 50, mean = 10, sd = 5)
    Y <- rnorm(n = 50, mean = 15, sd = 6)
    Z <- rnorm(n = 50, mean = 20, sd = 5)
    Data <- data.frame(X, Y, Z)

    library(plyr)

    combos <- combn(ncol(Data), 2)

    adply(combos, 2, function(x) {
      test <- t.test(Data[, x[1]], Data[, x[2]])

      out <- data.frame("var1" = colnames(Data)[x[1]],
                        "var2" = colnames(Data)[x[2]],
                        "t.value" = sprintf("%.3f", test$statistic),
                        "df" = test$parameter,
                        "p.value" = sprintf("%.3f", test$p.value))
      return(out)
    })

      X1 var1 var2 t.value       df p.value
    1  1    X    Y  -5.598 92.74744   0.000
    2  2    X    Z  -9.361 90.12561   0.000
    3  3    Y    Z  -3.601 97.62511   0.000
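If you would rather not depend on plyr, the same pairwise idea can probably be sketched in base R. This is only an illustration: the seed and the names `pairs` and `p_values` are my own assumptions, not part of the answer above.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

Data <- data.frame(
  X = rnorm(50, mean = 10, sd = 5),
  Y = rnorm(50, mean = 15, sd = 6),
  Z = rnorm(50, mean = 20, sd = 5)
)

# Every unordered pair of column names, one pair per column of the matrix
pairs <- combn(names(Data), 2)

# One Welch t-test per pair, keeping only the p-value
p_values <- apply(pairs, 2, function(v) {
  t.test(Data[[v[1]]], Data[[v[2]]])$p.value
})
names(p_values) <- apply(pairs, 2, paste, collapse = "-")

p_values
```

For the 521 columns in the question this would run choose(521, 2) = 135460 tests, so some multiple-testing correction is probably worth considering.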
+13

I would recommend converting your data frame to long format and using pairwise.t.test with an appropriate p.adjust method:

    > library(reshape2)
    >
    > df <- data.frame(a=runif(100),
    +                  b=runif(100),
    +                  c=runif(100)+0.5,
    +                  d=runif(100)+0.5,
    +                  e=runif(100)+1,
    +                  f=runif(100)+1)
    >
    > d <- melt(df)
    Using  as id variables
    >
    > pairwise.t.test(d$value, d$variable, p.adjust = "none")

        Pairwise comparisons using t tests with pooled SD

    data:  d$value and d$variable

      a      b      c      d      e
    b 0.86   -      -      -      -
    c <2e-16 <2e-16 -      -      -
    d <2e-16 <2e-16 0.73   -      -
    e <2e-16 <2e-16 <2e-16 <2e-16 -
    f <2e-16 <2e-16 <2e-16 <2e-16 0.63

    P value adjustment method: none
    > pairwise.t.test(d$value, d$variable, p.adjust = "bon")

        Pairwise comparisons using t tests with pooled SD

    data:  d$value and d$variable

      a      b      c      d      e
    b 1      -      -      -      -
    c <2e-16 <2e-16 -      -      -
    d <2e-16 <2e-16 1      -      -
    e <2e-16 <2e-16 <2e-16 <2e-16 -
    f <2e-16 <2e-16 <2e-16 <2e-16 1

    P value adjustment method: bonferroni
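One thing to keep in mind: as the output says, pairwise.t.test pools the standard deviation across all groups by default, so its p-values will generally differ from separate t.test calls. A minimal sketch of the difference, assuming you want the per-pair Welch tests instead; it uses base R's stack() rather than melt, and the seed and the names `long`, `pooled` and `welch` are just illustrative.

```r
set.seed(42)  # arbitrary seed for reproducibility

df <- data.frame(a = runif(100), b = runif(100), c = runif(100) + 0.5)

# Long format in base R: stack() returns the columns `values` and `ind`
long <- stack(df)

# Default behaviour: one pooled SD estimated from all groups
pooled <- pairwise.t.test(long$values, long$ind, p.adjust.method = "none")

# pool.sd = FALSE runs an ordinary two-sample t.test for each pair
welch <- pairwise.t.test(long$values, long$ind,
                         p.adjust.method = "none", pool.sd = FALSE)

welch$p.value["b", "a"]      # pairwise result for a vs b
t.test(df$a, df$b)$p.value   # the same comparison done directly
```

With pool.sd = FALSE the two printed values should agree; with the default pooled SD they usually will not.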
+15

Here is another solution, using outer:

    outer(
      1:ncol(Data), 1:ncol(Data),
      Vectorize(function(i, j) t.test(Data[, i], Data[, j])$p.value)
    )
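The result of outer here is a square matrix without labels, and every pair is tested twice (plus each column against itself). A small sketch of how one might label it and keep each pair once; the seed and the names `p_full` and `p_lower` are assumptions for illustration.

```r
set.seed(1)  # arbitrary seed for reproducibility

Data <- data.frame(
  X = rnorm(50, mean = 10, sd = 5),
  Y = rnorm(50, mean = 15, sd = 6),
  Z = rnorm(50, mean = 20, sd = 5)
)

# Full matrix of p-values, as in the outer() call above
p_full <- outer(
  seq_along(Data), seq_along(Data),
  Vectorize(function(i, j) t.test(Data[[i]], Data[[j]])$p.value)
)
dimnames(p_full) <- list(names(Data), names(Data))

# The matrix is symmetric and the diagonal compares a column with itself,
# so keep only the lower triangle
p_lower <- p_full
p_lower[upper.tri(p_lower, diag = TRUE)] <- NA

p_lower
```

This does roughly twice the necessary work, which may matter with several hundred columns.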
+4

Assuming your data frame looks something like this:

 df = data.frame(a=runif(100), b=runif(100), c=runif(100), d=runif(100), e=runif(100), f=runif(100)) 

the following

 tests = lapply(seq(1,length(df),by=2),function(x){t.test(df[,x],df[,x+1])}) 

will give you tests for each set of columns. Note that this only gives you a t.test for a and b, c and d, and e and f. If you want a and b, b and c, c and d, d and e, and e and f, you will need:

 tests = lapply(seq(1,(length(df)-1)),function(x){t.test(df[,x],df[,x+1])}) 

Finally, since you say you only want the p-values from your tests, you can do this:

 pvals = sapply(tests, function(x){x$p.value}) 

If you don't know how to work with an object, try entering summary(tests) and str(tests[[1]]). In this case tests is a list of htest objects, and what you want to know is the structure of an htest object, not necessarily of the list.
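Once you have seen the structure, collecting several fields at once might look like the sketch below. The seed, the four-column data frame and the names `results`, `pair` and `dof` are my own choices for illustration.

```r
set.seed(1)  # arbitrary seed for reproducibility

df <- data.frame(a = runif(100), b = runif(100), c = runif(100), d = runif(100))

# Adjacent columns: a vs b, b vs c, c vs d
tests <- lapply(seq(1, length(df) - 1), function(x) t.test(df[, x], df[, x + 1]))

# Component names of a single htest object
names(tests[[1]])

# One row per test, with the t statistic, degrees of freedom and p-value
results <- data.frame(
  pair    = paste(names(df)[-length(df)], names(df)[-1], sep = "-"),
  t       = sapply(tests, function(x) unname(x$statistic)),
  dof     = sapply(tests, function(x) unname(x$parameter)),
  p.value = sapply(tests, function(x) x$p.value)
)

results
```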

Hope this helps!

+2

I run this:

    tres <- apply(x, 1, t.test)
    pval <- vapply(tres, "[[", 0, i = "p.value")

It took me a while to guess the "vapply" trick to get the p-values out of the list of t.test result objects. (I edited this from 'sapply' due to Henrik's comment below.)

If this is a paired t-test, you can simply subtract and test for mean = 0, which gives exactly the same result (that is all a paired t.test does):

    tres <- apply(y - x, 1, t.test)
    pval <- vapply(tres, "[[", 0, i = "p.value")

Again, this runs a t-test on each row, across all columns.
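To check the equivalence yourself, a quick sketch along these lines should do. The matrices `x` and `y` are made-up data, and `p_diff` and `p_paired` are illustrative names.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Made-up data: 4 rows (e.g. features) measured on 10 paired samples
x <- matrix(rnorm(40, mean = 10), nrow = 4)
y <- x + matrix(rnorm(40, mean = 0.5), nrow = 4)

# One-sample t-test on the row-wise differences (mu = 0 is the default)
p_diff <- vapply(apply(y - x, 1, t.test), "[[", numeric(1), "p.value")

# Explicit paired t-test, row by row
p_paired <- sapply(seq_len(nrow(x)), function(r) {
  t.test(y[r, ], x[r, ], paired = TRUE)$p.value
})

all.equal(p_diff, p_paired)
```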

0

Source: https://habr.com/ru/post/1400925/

