Calculating the Least-Squares Confidence Band

I have a question that I am currently struggling with.

How do I calculate the 95% confidence band of a fit?

Fitting curves to data is everyday work for every physicist, so I assume this is implemented somewhere, but I can't find an implementation and I don't know how to do it mathematically.

The only thing I found is seaborn, which does a great job for linear least squares:

    import numpy as np
    from matplotlib import pyplot as plt
    import seaborn as sns
    import pandas as pd

    x = np.linspace(0, 10)
    y = 3 * np.random.randn(50) + x

    frame = pd.DataFrame({'x': x, 'y': y}, columns=['x', 'y'])
    sns.lmplot('x', 'y', frame, ci=95)
    plt.savefig("confidence_band.pdf")

[figure: seaborn linear least-squares fit with 95% confidence band]

But this is only linear least squares. When I want to fit a custom model, for example a saturation curve like f(x) = a·(1 − e^(b·x)), I am stuck.

Of course, I could compute confidence intervals from the t-distribution using the standard errors that a least-squares routine such as scipy.optimize.curve_fit returns, but that is not what I am looking for.
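For reference, the t-distribution approach I mean looks roughly like this: it yields 95% intervals for the fitted parameters, but not a confidence band over x (the hypothetical saturation model and noise level here are just for illustration):

```python
import numpy as np
from scipy import optimize, stats

def model(x, a, b):
    # saturation curve f(x) = a * (1 - exp(b*x)), with b < 0
    return a * (1 - np.exp(b * x))

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = model(x, 1.0, -0.4) + 0.1 * rng.standard_normal(x.size)

popt, pcov = optimize.curve_fit(model, x, y, p0=[0.1, -0.1])
perr = np.sqrt(np.diag(pcov))     # standard errors of a and b
dof = x.size - len(popt)          # degrees of freedom
tval = stats.t.ppf(0.975, dof)    # two-sided 95% t-quantile

for name, p, e in zip("ab", popt, perr):
    print(f"{name} = {p:.3f} +/- {tval * e:.3f} (95% CI)")
```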

Thanks for any help!

2 answers

You can achieve this easily with statsmodels.

Also see this example and this answer .

Here is the answer to your question:

    import numpy as np
    from matplotlib import pyplot as plt
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import summary_table

    x = np.linspace(0, 10)
    y = 3 * np.random.randn(50) + x

    X = sm.add_constant(x)
    res = sm.OLS(y, X).fit()

    st, data, ss2 = summary_table(res, alpha=0.05)
    fittedvalues = data[:, 2]
    predict_mean_se = data[:, 3]
    predict_mean_ci_low, predict_mean_ci_upp = data[:, 4:6].T
    predict_ci_low, predict_ci_upp = data[:, 6:8].T

    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(x, y, 'o', label="data")
    ax.plot(x, fittedvalues, 'r-', label='OLS')
    ax.plot(x, predict_ci_low, 'b--')       # 95% prediction interval
    ax.plot(x, predict_ci_upp, 'b--')
    ax.plot(x, predict_mean_ci_low, 'g--')  # 95% confidence band of the mean
    ax.plot(x, predict_mean_ci_upp, 'g--')
    ax.legend(loc='best')
    plt.show()

(Note: the curves must be plotted against x, not against the design matrix X returned by sm.add_constant.)

[figure: OLS fit with 95% confidence and prediction bands]


kmpfit's confidence_band() computes the confidence band for non-linear least squares. Here it is for the saturation curve:

    from pylab import *
    from kapteyn import kmpfit

    def model(p, x):
        a, b = p
        return a * (1 - np.exp(b * x))

    x = np.linspace(0, 10, 100)
    y = .1 * np.random.randn(x.size) + model([1, -.4], x)

    fit = kmpfit.simplefit(model, [.1, -.1], x, y)
    a, b = fit.params

    # partial derivatives of the model with respect to a and b
    dfdp = [1 - np.exp(b * x), -a * x * np.exp(b * x)]
    yhat, upper, lower = fit.confidence_band(x, dfdp, 0.95, model)

    scatter(x, y, marker='.', color='#0000ba')
    for i, l in enumerate((upper, lower, yhat)):
        plot(x, l, c='g' if i == 2 else 'r', lw=2)
    savefig('kmpfit confidence bands.png', bbox_inches='tight')

dfdp holds the partial derivatives ∂f/∂p of the model f = a·(1 − e^(b·x)) with respect to each parameter p (i.e., a and b); see my answer to a similar question for background links. And here is the output:

[figure: kmpfit confidence band around the saturation-curve fit]
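If you would rather avoid the kapteyn dependency, the same kind of band can be approximated with the delta method, using the covariance matrix that scipy.optimize.curve_fit returns together with the same partial derivatives. This is a sketch of that idea, not part of kmpfit:

```python
import numpy as np
from scipy import optimize, stats

def model(x, a, b):
    # same saturation curve as above
    return a * (1 - np.exp(b * x))

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = model(x, 1.0, -0.4) + 0.1 * rng.standard_normal(x.size)

popt, pcov = optimize.curve_fit(model, x, y, p0=[0.1, -0.1])
a, b = popt

# Jacobian columns: df/da and df/db, evaluated at the fitted parameters
J = np.column_stack([1 - np.exp(b * x), -a * x * np.exp(b * x)])

# delta method: var(f(x)) ~= diag(J @ pcov @ J.T)
var_f = np.einsum("ij,jk,ik->i", J, pcov, J)
tval = stats.t.ppf(0.975, x.size - len(popt))

yhat = model(x, *popt)
lower = yhat - tval * np.sqrt(var_f)
upper = yhat + tval * np.sqrt(var_f)
```

The band widens where the derivatives (and hence the parameter uncertainty's effect on the curve) are largest, matching the behavior of kmpfit's confidence_band().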


Source: https://habr.com/ru/post/978651/