R: t-test for all columns

Question

I tried to perform a t-test on all columns (two at a time) of my data frame and extract only the p-value. Here is what I came up with:

 for (i in c(5:525)) {
   t_test_p.value = sapply(Data[5:525], function(x) t.test(Data[, i], x, na.rm = TRUE)$p.value)
 }

My questions:

1. Is there a way to do this without a loop?
2. How do I get the t-test results?

+4

r

ery Mar 12 '12 at 3:39

5 answers

I would recommend converting your data frame to long format and using pairwise.t.test with an appropriate p.adjust method:

 > library(reshape2)
 >
 > df <- data.frame(a=runif(100),
 +                  b=runif(100),
 +                  c=runif(100)+0.5,
 +                  d=runif(100)+0.5,
 +                  e=runif(100)+1,
 +                  f=runif(100)+1)
 >
 > d <- melt(df)
 Using as id variables
 >
 > pairwise.t.test(d$value, d$variable, p.adjust = "none")

     Pairwise comparisons using t tests with pooled SD

 data:  d$value and d$variable

   a      b      c      d      e
 b 0.86   -      -      -      -
 c <2e-16 <2e-16 -      -      -
 d <2e-16 <2e-16 0.73   -      -
 e <2e-16 <2e-16 <2e-16 <2e-16 -
 f <2e-16 <2e-16 <2e-16 <2e-16 0.63

 P value adjustment method: none
 > pairwise.t.test(d$value, d$variable, p.adjust = "bon")

     Pairwise comparisons using t tests with pooled SD

 data:  d$value and d$variable

   a      b      c      d      e
 b 1      -      -      -      -
 c <2e-16 <2e-16 -      -      -
 d <2e-16 <2e-16 1      -      -
 e <2e-16 <2e-16 <2e-16 <2e-16 -
 f <2e-16 <2e-16 <2e-16 <2e-16 1

 P value adjustment method: bonferroni
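One caveat: as the output header says, pairwise.t.test pools the standard deviation across all groups by default, so its p-values will not match those of separate t.test calls on two columns. Below is a minimal sketch, assuming you want the same Welch tests the question uses: pass pool.sd = FALSE and read the p-values from the $p.value element of the result. Base R's stack() stands in for melt() here only so that the example needs no extra package; the data, seed and variable names are made up for illustration.

```r
# Made-up example data; the seed only makes the run reproducible.
set.seed(1)
df <- data.frame(a = runif(100),
                 b = runif(100),
                 c = runif(100) + 0.5)

# stack() is a base-R way to get long format: columns `values` and `ind`.
long <- stack(df)

# pool.sd = FALSE makes each comparison a separate two-sample (Welch) t-test.
res <- pairwise.t.test(long$values, long$ind,
                       p.adjust.method = "none", pool.sd = FALSE)

# Lower-triangular matrix of p-values: rows are b, c and columns are a, b.
# Cells above the diagonal are NA.
pmat <- res$p.value
print(pmat)
```

With pool.sd = FALSE and no adjustment, pmat["b", "a"] should agree with t.test(df$b, df$a)$p.value.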

+15

kohske Mar 12 '12 at 7:45

Here is another solution, using outer:

 outer(
   1:ncol(Data), 1:ncol(Data),
   Vectorize(function(i, j) t.test(Data[, i], Data[, j])$p.value)
 )
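The matrix that outer returns carries no row or column names, so it can be hard to tell which cell belongs to which pair. A small sketch of the same call with labels added, run on made-up data standing in for the asker's Data (the seed and the three columns are assumptions for illustration):

```r
# Made-up example data standing in for the asker's `Data`.
set.seed(1)
Data <- data.frame(X = rnorm(50, mean = 10, sd = 5),
                   Y = rnorm(50, mean = 15, sd = 6),
                   Z = rnorm(50, mean = 20, sd = 5))

idx <- seq_len(ncol(Data))

# Vectorize() is needed because outer() passes whole index vectors to FUN.
pmat <- outer(idx, idx,
              Vectorize(function(i, j) t.test(Data[, i], Data[, j])$p.value))

# Label rows and columns so each cell reads as "row variable vs. column variable".
dimnames(pmat) <- list(names(Data), names(Data))
print(round(pmat, 4))
```

Note that this computes every pair twice (the matrix is symmetric) and also tests each column against itself, which gives a p-value of 1 on the diagonal.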

+4

Vincent zoonekynd Mar 12 '12 at 5:38

Assuming your data frame looks something like this:

 df = data.frame(a=runif(100), b=runif(100), c=runif(100), d=runif(100), e=runif(100), f=runif(100))

then the following

 tests = lapply(seq(1,length(df),by=2),function(x){t.test(df[,x],df[,x+1])})

will give you tests for each non-overlapping pair of adjacent columns. Note that this will only give you a t.test for a and b, c and d, and e and f. If you want a and b, b and c, c and d, d and e, and e and f, you will need:

 tests = lapply(seq(1,(length(df)-1)),function(x){t.test(df[,x],df[,x+1])})

Finally, since you say that you want only the p-values from your tests, you can do this:

 pvals = sapply(tests, function(x){x$p.value})

If you don't know how to work with the object, try entering summary(tests) and str(tests[[1]]). In this case tests is a list of htest objects, and you want to know the structure of an htest object, not necessarily of the list.
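As a rough illustration of that kind of inspection, here is a sketch on a single test; the data, seed and the name tst are made up for the example:

```r
# Made-up example data.
set.seed(1)
df <- data.frame(a = runif(100), b = runif(100))

tst <- t.test(df$a, df$b)

class(tst)        # the class of the object returned by t.test
names(tst)        # the components you can extract with `$`
str(tst$p.value)  # the p-value is a single number
tst$conf.int      # confidence interval for the difference in means
```

Any of the names listed by names(tst) can be pulled out of every element of a list of tests in the same way as the p-value.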

Hope this helps!

+2

Davy kavanagh Mar 12 '12 at 4:01

I run this:

 tres<-apply(x,1,t.test)
 pval<-vapply(tres, "[[", 0, i = "p.value")

It took me a while to guess the “vapply” trick to get the pvals out of the list of t.test result objects. (I edited this from 'sapply' due to Henrik's comment below)

If this is a paired t-test, you can simply subtract and test whether the mean = 0, which gives exactly the same result (that is all a paired t.test does):

 tres<-apply(y-x,1,t.test)
 pval<-vapply(tres, "[[", 0, i = "p.value")

Again, this runs one t-test per row, using the values in all columns of that row.
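To check the claim that a paired test is the same as a one-sample test on the differences, here is a small sketch. The matrices x and y, their sizes and the seed are made up for illustration:

```r
# Made-up matrices of paired measurements: 4 rows, 10 columns each.
set.seed(1)
x <- matrix(rnorm(40, mean = 10), nrow = 4)
y <- x + matrix(rnorm(40, mean = 0.5), nrow = 4)

# One-sample test of the row-wise differences against a mean of 0 ...
p_diff <- apply(y - x, 1, function(d) t.test(d)$p.value)

# ... and the explicit paired test, row by row.
p_paired <- sapply(seq_len(nrow(x)),
                   function(r) t.test(y[r, ], x[r, ], paired = TRUE)$p.value)

# Compare the two sets of p-values.
all.equal(p_diff, p_paired)
```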

0

Erik aronesty 21 sept '12 at 20:00

MYaseen208 · Accepted Answer · 2012-03-12T04:01:32+0000

Try this one

 X <- rnorm(n=50, mean = 10, sd = 5)
 Y <- rnorm(n=50, mean = 15, sd = 6)
 Z <- rnorm(n=50, mean = 20, sd = 5)
 Data <- data.frame(X, Y, Z)

 library(plyr)

 combos <- combn(ncol(Data),2)

 adply(combos, 2, function(x) {
   test <- t.test(Data[, x[1]], Data[, x[2]])

   out <- data.frame("var1" = colnames(Data)[x[1]]
                     , "var2" = colnames(Data[x[2]])
                     , "t.value" = sprintf("%.3f", test$statistic)
                     , "df"= test$parameter
                     , "p.value" = sprintf("%.3f", test$p.value)
                     )
   return(out)
 })

   X1 var1 var2 t.value       df p.value
 1  1    X    Y  -5.598 92.74744   0.000
 2  2    X    Z  -9.361 90.12561   0.000
 3  3    Y    Z  -3.601 97.62511   0.000
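If you would rather not depend on plyr, the same idea can be sketched in base R with combn, lapply and rbind. This is only one possible variant, not the answer's own code: the seed, the names pairs, rows and result, and the extra Bonferroni-adjusted column are assumptions added for illustration. Unlike the sprintf version above, it keeps the statistics numeric so they can be sorted or adjusted later.

```r
# Made-up example data with the same shape as in the answer.
set.seed(1)
Data <- data.frame(X = rnorm(50, mean = 10, sd = 5),
                   Y = rnorm(50, mean = 15, sd = 6),
                   Z = rnorm(50, mean = 20, sd = 5))

# All unordered pairs of column names, as a list of length-2 character vectors.
pairs <- combn(names(Data), 2, simplify = FALSE)

# One single-row data frame per pair.
rows <- lapply(pairs, function(p) {
  test <- t.test(Data[[p[1]]], Data[[p[2]]])
  data.frame(var1 = p[1], var2 = p[2],
             t.value = unname(test$statistic),
             df = unname(test$parameter),
             p.value = test$p.value,
             stringsAsFactors = FALSE)
})
result <- do.call(rbind, rows)

# Optional: adjust for the number of comparisons.
result$p.adj <- p.adjust(result$p.value, method = "bonferroni")
print(result)
```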
