How can I classify the results of a post-hoc test in R?

I am trying to understand how to work with ANOVA and post-hoc tests in R. So far I have used aov () and TukeyHSD () to analyze my data. Example:

uni2.anova <- aov(Sum_Uni ~ Micro, data= uni2) uni2.anova Call: aov(formula = Sum_Uni ~ Micro, data = uni2) Terms: Micro Residuals Sum of Squares 0.04917262 0.00602925 Deg. of Freedom 15 48 Residual standard error: 0.01120756 Estimated effects may be unbalanced 

My problem is that now I have a huge list of pairwise comparisons, but I can’t do anything about it:

  TukeyHSD(uni2.anova) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = Sum_Uni ~ Micro, data = uni2) $Micro diff lwr upr p adj Act_Glu2-Act_Ala2 -0.0180017863 -0.046632157 0.0106285840 0.6448524 Ana_Ala2-Act_Ala2 -0.0250134285 -0.053643799 0.0036169417 0.1493629 NegI_Ala2-Act_Ala2 0.0702274527 0.041597082 0.0988578230 0.0000000 

This dataset contains 40 rows ... Idealy, I would like to get a dataset that looks something like this:

  • Act_Glu2: a
  • Act_Ala2: a
  • NegI_Ala2: b ...

I hope you understand. So far, I have not found anything comparable online ... I also tried to select only significant pairs in the file obtained in TukeyHSD, but the file did not “recognize” that it consists of rows and columns, which makes it impossible to select ..

Perhaps there is something fundamentally wrong with my approach?

+6
source share
3 answers

I think the OP wants the letters to get an idea of ​​the comparison.

 library(multcompView) multcompLetters(extract_p(TukeyHSD(uni2.anova))) 

This will give you letters.

You can also use multcomp package

 library(multcomp) cld(glht(uni2.anova, linct = mcp(Micro = "Tukey"))) 

Hope this is what you need.

+7
source

TukeyHSD Results - This is a list. Use str to look at the structure. In your case, you will see that this is a list of one element, and this element is basically a matrix. So, to extract the first column, you want to save the result of TukeyHSD

 hsd <- TukeyHSD(uni2.anova) 

If you look at str(hsd) , you can then get a bit ...

 hsd$Micro[,1] 

This will give you a column of your differences. Now you can extract what you want.

+2
source

It's hard to say without example data, but assuming that Micro is just a factor with 4 levels, and uni2 looks something like

 n = 40 Micro = c('Act_Glu2', 'Act_Ala2', 'Ana_Ala2', 'NegI_Ala2')[sample(4, 40, rep=T)] Sum_Uni = rnorm(n, 5, 0.5) Sum_Uni[Micro=='Act_Glu2'] = Sum_Uni[Micro=='Act_Glu2'] + 0.5 uni2 = data.frame(Sum_Uni, Micro) 
 > uni2 Sum_Uni Micro 1 4.964061 Ana_Ala2 2 4.807680 Ana_Ala2 3 4.643279 NegI_Ala2 4 4.793383 Act_Ala2 5 5.307951 NegI_Ala2 6 5.171687 Act_Glu2 ... 

then I think that you are actually trying to get the basic output with multiple regression:

 fit = lm(Sum_Uni ~ Micro, data = uni2) summary(fit) anova(fit) 
 > summary(fit) Call: lm(formula = Sum_Uni ~ Micro, data = uni2) Residuals: Min 1Q Median 3Q Max -1.26301 -0.35337 -0.04991 0.29544 1.07887 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 4.8364 0.1659 29.157 < 2e-16 *** MicroAct_Glu2 0.9542 0.2623 3.638 0.000854 *** MicroAna_Ala2 0.1844 0.2194 0.841 0.406143 MicroNegI_Ala2 0.1937 0.2158 0.898 0.375239 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.4976 on 36 degrees of freedom Multiple R-squared: 0.2891, Adjusted R-squared: 0.2299 F-statistic: 4.88 on 3 and 36 DF, p-value: 0.005996 > anova(fit) Analysis of Variance Table Response: Sum_Uni Df Sum Sq Mean Sq F value Pr(>F) Micro 3 3.6254 1.20847 4.8801 0.005996 ** Residuals 36 8.9148 0.24763 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

You can access the numbers in any of these tables, for example,

 > summary(fit)$coef[2,4] [1] 0.0008536287 

To view a list of what is stored in each object, use names() :

 > names(summary(fit)) [1] "call" "terms" "residuals" "coefficients" [5] "aliased" "sigma" "df" "r.squared" [9] "adj.r.squared" "fstatistic" "cov.unscaled" 

In addition to the found TukeyHSD() function, there are many other options for further searching for paired tests and, if desired, correcting p-values. These include pairwise.table() , estimable() in gmodels , resampling and boot packages and others ...

+1
source

Source: https://habr.com/ru/post/900622/


All Articles