R: unexpected results p.adjust (FDR)

This problem arose when I tried to use p.adjust to start setting the FDR for the vector of p-values. The problem was that many of the obtained p-values ​​were identical. I thought it might be some kind of quirk with my data, but I reproduced the same problem with arbitrary input vectors. For instance:

 temp <- runif(40, min = 0, max = 1) temp # [1] 0.81563482 0.17471363 0.74697979 0.88755248 0.69676260 0.58207008 # [7] 0.87138747 0.76432908 0.64352955 0.06501659 0.70680646 0.81557281 #[13] 0.58480274 0.19476004 0.01035137 0.46119840 0.17783440 0.71828874 #[19] 0.30978182 0.76902202 0.01615609 0.93122152 0.37936458 0.52369562 #[25] 0.90997054 0.30098383 0.70873206 0.71159740 0.38148526 0.78415579 #[31] 0.64605509 0.18898361 0.76770495 0.40651004 0.42255944 0.68395505 #[37] 0.51844368 0.25855720 0.12090991 0.50110836 p.adjust(temp, method="fdr") # [1] 0.9062609 0.9062609 0.9062609 0.9312215 0.9062609 0.9062609 0.9312215 # [8] 0.9062609 0.9062609 0.8668878 0.9062609 0.9062609 0.9062609 0.9062609 #[15] 0.3231218 0.9062609 0.9062609 0.9062609 0.9062609 0.9062609 0.3231218 #[22] 0.9312215 0.9062609 0.9062609 0.9312215 0.9062609 0.9062609 0.9062609 #[29] 0.9062609 0.9062609 0.9062609 0.9062609 0.9062609 0.9062609 0.9062609 #[36] 0.9062609 0.9062609 0.9062609 0.9062609 0.9062609 

I would like to think that I'm wrong. I do not think it is reasonable for FDR to adjust p values ​​for all the same values. Most of the data is identical and not all correspond to the original p-value. Any ideas what is wrong here? Thanks.

+6
source share
1 answer

The method works fine. You just do not understand what the algorithm does completely. I would advise reading a little about the procedure (Wikipedia Link) .

 n <- 40 p <- runif(n, 0, 1) adjusted.p <- p.adjust(p, "fdr") #same as p.adjust(p, "BH") ## Let recreate. Going back to the paper we ## can construct a q-value by sorting the p-values ## and then applying the transformation: ## q_i = p_i*n/i ## Where p_i is the ith smallest p-value ## Sort the p-values and keep track of the order id <- order(p) tmp <- p[id] (q <- tmp*n/(1:n)) # [1] 2.0322183 1.0993436 1.2357457 1.1113084 0.9282536 0.7877181 0.7348436 # [8] 0.7204077 0.6587102 0.6289974 0.9205222 0.8827835 0.8600234 0.8680704 #[15] 1.1532693 1.1055951 1.0451330 1.1314681 1.1167659 1.1453142 1.1012847 #[22] 1.0783747 1.0672471 1.0500311 1.0389407 1.0089661 1.0117881 0.9917817 #[29] 0.9892826 0.9668636 0.9844052 0.9792733 1.0000320 0.9967073 0.9707149 #[36] 0.9747723 0.9968153 0.9813367 1.0058445 0.9942353 ## However there is one more portion to the adjustment ## We don't want a p-value of .02 getting ## a larger q-value than a p-value of .05 ## so we make sure that if a smaller q-value ## shows up in the list we set all of ## the q-values corresponding to smaller p-values ## to that corresponding q-value. ## We can do this by reversing the list ## Keeping a running tally of the minimum value ## and then re-reversing new.q <- rev(cummin(rev(q))) ## Put it back in the original order new.q[order(id)] # [1] 0.6289974 0.9668636 0.6289974 0.6289974 0.6289974 0.9707149 0.9707149 # [8] 0.8600234 0.9668636 0.6289974 0.9707149 0.9668636 0.9668636 0.9813367 #[15] 0.9668636 0.9668636 0.9668636 0.9707149 0.9668636 0.9668636 0.6289974 #[22] 0.9942353 0.8600234 0.6289974 0.9668636 0.6289974 0.6289974 0.8680704 #[29] 0.9668636 0.9668636 0.9668636 0.9942353 0.9707149 0.9813367 0.9668636 #[36] 0.8600234 0.6289974 0.9668636 0.9668636 0.9747723 ## This should match the adjustment adjusted.p # [1] 0.6289974 0.9668636 0.6289974 0.6289974 0.6289974 0.9707149 0.9707149 # [8] 0.8600234 0.9668636 0.6289974 0.9707149 0.9668636 0.9668636 0.9813367 #[15] 0.9668636 0.9668636 0.9668636 0.9707149 0.9668636 0.9668636 0.6289974 #[22] 0.9942353 0.8600234 0.6289974 0.9668636 0.6289974 0.6289974 0.8680704 #[29] 0.9668636 0.9668636 0.9668636 0.9942353 0.9707149 0.9813367 0.9668636 #[36] 0.8600234 0.6289974 0.9668636 0.9668636 0.9747723 
+14
source

Source: https://habr.com/ru/post/914168/


All Articles