I am new to R and recently started using it to analyze microchip data. The overall goal of the analysis is to take DC2 and compare the WT versus KO groups in this population. But I ran into some problems with limma handling. After processing the data using the oligo package, I tried to create a design matrix for analysis using limma. This is my workflow for ExpressionSet DC2:
pData(DC2)
index filename genotype cell_type
1 KO DC2 2 HP10.CEL KO DC2
2 KO DC2 3 HP11.CEL KO DC2
3 KO DC2 4 HP12.CEL KO DC2
1 WT DC2 10 HP7.CEL WT DC2
2 WT DC2 11 HP8.CEL WT DC2
3 WT DC2 12 HP9.CEL WT DC2
design <- model.matrix(~DC2$genotype)
design
(Intercept) DC2$genotypeWT
1 1 0
2 1 0
3 1 0
4 1 1
5 1 1
6 1 1
fit <- lmFit(DC2, design)
fit <- eBayes(fit)
toptable(fit)
This lists the genes as follows:
logFC t P.Value adj.P.Val B
17551163 14.09722 208.2627 2.990326e-13 2.700912e-10 17.14467
17511316 13.91167 205.0811 3.292503e-13 2.700912e-10 17.12716
17551167 13.92093 204.5801 3.343243e-13 2.700912e-10 17.12434
17375373 13.76320 202.1271 3.605170e-13 2.700912e-10 17.11025
17550685 13.74022 201.5428 3.671032e-13 2.700912e-10 17.10682
However, when I go to check this manually (just using the first function) using this code:
toptable(fit, n=1)
genename <- rownames(toptable(fit, n=1))
typeMean <- tapply(exprs(DC2)[genename,], DC2$genotype, mean)
typeMean["KO"] - typeMean["WT"]
The output for the same function "17551163" is different
KO
0.04538317
, . , ? .