Here's an abridged version of shasan's code, computing the 95% confidence interval of the mean of array a:
import numpy as np, scipy.stats as st
st.t.interval(0.95, len(a)-1, loc=np.mean(a), scale=st.sem(a))
But using StatsModels' tconfint_mean is arguably even nicer:
import statsmodels.stats.api as sms
sms.DescrStatsW(a).tconfint_mean()
The underlying assumption for both is that the sample (array a) was drawn independently from a normal distribution with unknown standard deviation (see MathWorld or Wikipedia).
For large sample size n, the sample mean is approximately normally distributed, and its confidence interval can be calculated using st.norm.interval() (as suggested in Jaime's comment). But the above solutions are correct also for small n, where st.norm.interval() gives confidence intervals that are too narrow (i.e., "fake confidence"). See my answer to a similar question for more details (and one of Russ's comments here).
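For context, the session below calls a mean_confidence_interval helper from shasan's answer, which is not reproduced in this excerpt. A minimal sketch of such a t-based helper (an approximation of the original, returning the mean followed by the interval endpoints):

```python
import numpy as np
import scipy.stats as st

def mean_confidence_interval(data, confidence=0.95):
    # t-based CI: mean +/- t_{(1+conf)/2, n-1} * standard error of the mean
    a = np.asarray(data, dtype=float)
    n = len(a)
    m, se = np.mean(a), st.sem(a)
    h = se * st.t.ppf((1 + confidence) / 2.0, n - 1)
    return m, m - h, m + h

mean_confidence_interval([10, 11, 12, 13])
# (11.5, 9.445739743239122, 13.554260256760879)
```

This matches st.t.interval(...) exactly, since both use the same t quantile and standard error.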
Here is an example where the correct parameters give (essentially) identical confidence intervals:
In [9]: a = range(10,14)
In [10]: mean_confidence_interval(a)
Out[10]: (11.5, 9.4457397432391215, 13.554260256760879)
In [11]: st.t.interval(0.95, len(a)-1, loc=np.mean(a), scale=st.sem(a))
Out[11]: (9.4457397432391215, 13.554260256760879)
In [12]: sms.DescrStatsW(a).tconfint_mean()
Out[12]: (9.4457397432391197, 13.55426025676088)
And finally, the incorrect (too narrow) result from st.norm.interval():
In [13]: st.norm.interval(0.95, loc=np.mean(a), scale=st.sem(a))
Out[13]: (10.23484868811834, 12.76515131188166)
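To illustrate the large-n claim above, a quick sketch comparing the two intervals on simulated data (the seed and distribution parameters here are arbitrary, chosen only for reproducibility):

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(42)             # assumed seed, for reproducibility
a = rng.normal(loc=10, scale=2, size=1000)  # large sample, n = 1000

t_ci = st.t.interval(0.95, len(a) - 1, loc=np.mean(a), scale=st.sem(a))
z_ci = st.norm.interval(0.95, loc=np.mean(a), scale=st.sem(a))

# With n = 1000 the t and normal intervals essentially coincide;
# the normal interval is only marginally narrower.
print(t_ci)
print(z_ci)
```

For small n the gap is much larger, because the t quantile (e.g. ~3.18 for df=3 vs ~1.96 for the normal) compensates for the extra uncertainty in the estimated standard deviation.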
Ulrich Stern Dec 26 '15 at 18:56