Am I missing something obvious, or is Matlab's kstest2 giving very poor p-values? By very poor, I mean that I suspect it may even be implemented incorrectly.
The help page for kstest2 states that the function calculates an asymptotic p-value, although I could not find a reference saying exactly which method is used. In any case, the description further states:
The asymptotic p-value becomes very accurate for large sample sizes, and is believed to be reasonably accurate for sample sizes n1 and n2, such that (n1*n2)/(n1 + n2) ≥ 4.
Example 1
Let us take Example 6 from Lehmann and D'Abrera (1975):
sampleA = [6.8, 3.1, 5.8, 4.5, 3.3, 4.7, 4.2, 4.9];
sampleB = [4.4, 2.5, 2.8, 2.1, 6.6, 0.0, 4.8, 2.3];
[h,p,ks2stat] = kstest2(sampleA, sampleB, 'Tail', 'unequal');
Here (n1*n2)/(n1 + n2) = 4, so the p-value should be reasonably accurate.
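For reference, both the criterion and the test statistic for this example can be checked by hand, without calling kstest2. The following is only a minimal sketch using base Matlab functions; the variable names (nEff, pooled, F1, F2, D) are my own and do not come from the kstest2 code.

```matlab
sampleA = [6.8, 3.1, 5.8, 4.5, 3.3, 4.7, 4.2, 4.9];
sampleB = [4.4, 2.5, 2.8, 2.1, 6.6, 0.0, 4.8, 2.3];

n1 = numel(sampleA);
n2 = numel(sampleB);
nEff = (n1*n2)/(n1 + n2);      % effective sample size from the help page: 4

% Two-sample KS statistic: largest gap between the two empirical CDFs,
% evaluated at every observed point of the pooled sample.
pooled = sort([sampleA, sampleB]);
F1 = arrayfun(@(t) mean(sampleA <= t), pooled);
F2 = arrayfun(@(t) mean(sampleB <= t), pooled);
D = max(abs(F1 - F2));         % 0.625, reached at x = 2.8

fprintf('nEff = %g, D = %g\n', nEff, D);
```

So the statistic itself is D = 5/8, and any disagreement between packages has to come from how D is converted into a p-value.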
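Matlab gives p = 0.0497, whereas the solution given in the book is 0.0870.
To check which value is right, I repeated the test in R and compared the result with Matlab's.
Using ks.test from the stats package and ks.boot from the Matching package: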
ks.test(sampleA, sampleB, alternative = "two.sided")
ks.boot(sampleA, sampleB, alternative = "two.sided")
Both give p = 0.0870.
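As a further cross-check that is independent of both packages, the exact two-sided p-value for Example 1 can be obtained by counting lattice paths. This is a minimal sketch, assuming there are no ties between the two samples (true for this data); D = 0.625 is the KS statistic of the Example 1 samples, and the names N, tol and pExact are my own.

```matlab
n1 = 8; n2 = 8;     % sample sizes in Example 1
D  = 0.625;         % observed two-sample KS statistic for Example 1
tol = 1e-9;         % guards the >= comparison against rounding error

% N(i+1, j+1) = number of monotone lattice paths from (0,0) to (i,j)
% along which |i/n1 - j/n2| stays strictly below D. Each path is one
% ordering of the pooled sample, and all orderings are equally likely
% under the null hypothesis.
N = zeros(n1 + 1, n2 + 1);
for i = 0:n1
    for j = 0:n2
        if abs(i/n1 - j/n2) >= D - tol
            N(i+1, j+1) = 0;            % path would reach a gap of at least D
        elseif i == 0 && j == 0
            N(i+1, j+1) = 1;            % starting point
        else
            fromA = 0; fromB = 0;
            if i > 0, fromA = N(i, j+1); end   % last step came from sample A
            if j > 0, fromB = N(i+1, j); end   % last step came from sample B
            N(i+1, j+1) = fromA + fromB;
        end
    end
end

% P(statistic >= D) = 1 - (paths staying below D) / (all paths)
pExact = 1 - N(n1+1, n2+1) / nchoosek(n1 + n2, n1);
fprintf('exact p = %.4f\n', pExact);    % 0.0870
```

This reproduces the book's value and the R result, so the 0.0870 figure looks like the exact p-value rather than an approximation.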
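Example 2
Now let us use the example from the kstest2 documentation to compare the Matlab and R results for a larger sample size: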
rng(1); % For reproducibility
x1 = wblrnd(1,1,1,50);
x2 = wblrnd(1.2,2,1,50);
[h,p,ks2stat] = kstest2(x1,x2);
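This gives p = 0.0317. With the same x1 and x2 vectors, R gives p = 0.03968.
That is a difference of about 20%, even though a very accurate result is expected here, since (n1*n2)/(n1 + n2) = 25.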
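Am I missing something, or messing something up?
Is it possible that Matlab's kstest2 really performs as poorly as these examples suggest? What approximation or algorithm does kstest2 use? (I can see the implemented code of kstest2, but a reference to a book or paper would be much more helpful for understanding what is going on.)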
I am using Matlab 2016a.
Lehmann and D'Abrera (1975). Nonparametrics: Statistical Methods Based on Ranks. 1st edition. Springer.