Equivalent to R cor.test in Python

Is there a way to find the confidence interval for r in Python?

In R, I could do something like:

cor.test(m, h)

    Pearson product-moment correlation

data:  m and h
t = 0.8974, df = 4, p-value = 0.4202
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.6022868  0.9164582
sample estimates:
      cor 
0.4093729

In Python, I can calculate r (cor) using:

r,p = scipy.stats.pearsonr(df.age, df.pets)

But this does not return the confidence interval for r.

1 answer

Here is one way to calculate the confidence interval.

First, get the correlation value (Pearson's r):

In [85]: import numpy as np
    ...: from scipy import stats

In [86]: corr = stats.pearsonr(df['col1'], df['col2'])

In [87]: corr
Out[87]: (0.551178607008175, 0.0)

Use the Fisher transform (the inverse hyperbolic tangent of r) to get z:

In [88]: z = np.arctanh(corr[0])

In [89]: z
Out[89]: 0.62007264620685021

Then compute sigma, i.e. the standard error of z, which is 1/sqrt(n - 3) for n observations:

In [90]: sigma = (1/((len(df.index)-3)**0.5))

In [91]: sigma
Out[91]: 0.013840913308956662

Get the two-sided 95% critical value from the normal percent point function (the inverse CDF, stats.norm.ppf) and build the interval around z:

In [92]: cint = z + np.array([-1, 1]) * sigma * stats.norm.ppf((1+0.95)/2)

Finally, take the hyperbolic tangent to map the interval back to the correlation scale, which gives the 95% confidence interval for r:

In [93]: np.tanh(cint)
Out[93]: array([ 0.53201034,  0.56978224])
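
The steps above can be collected into a small helper. This is only a sketch: the names fisher_ci and pearsonr_ci are made up for illustration and are not part of SciPy, and it assumes the same normal approximation for the Fisher-transformed r that is used above.

```python
import numpy as np
from scipy import stats


def fisher_ci(r, n, confidence=0.95):
    """Confidence interval for a Pearson r computed from n paired observations.

    Hypothetical helper (not a SciPy function); uses the Fisher z-transform.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = np.arctanh(r)                             # Fisher transform of r
    sigma = 1.0 / np.sqrt(n - 3)                  # standard error of z
    crit = stats.norm.ppf((1 + confidence) / 2)   # two-sided critical value
    lo, hi = np.tanh([z - crit * sigma, z + crit * sigma])
    return float(lo), float(hi)


def pearsonr_ci(x, y, confidence=0.95):
    """Return (r, p, lo, hi) for two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r, p = stats.pearsonr(x, y)
    lo, hi = fisher_ci(r, len(x), confidence)
    return float(r), float(p), lo, hi
```

For the R output in the question (cor = 0.4093729, df = 4, so n = 6), fisher_ci(0.4093729, 6) should agree with the interval that cor.test printed, up to rounding. With the data frame from the question, the call would be pearsonr_ci(df.age, df.pets). Recent SciPy releases (1.9 and later, as far as I know) also expose a confidence_interval() method on the object returned by stats.pearsonr, which may remove the need for a helper like this.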

Source: https://habr.com/ru/post/1589373/

