You say that you have sample data whose logarithm follows a normal distribution.
Suppose data is an array containing the samples. To fit a log-normal distribution to this data using scipy.stats.lognorm, use:
s, loc, scale = stats.lognorm.fit(data, floc=0)
Now suppose mu and sigma are the mean and standard deviation of the underlying normal distribution. To get estimates of these values from this fit, use:
estimated_mu = np.log(scale)
estimated_sigma = s
(These are not estimates of the mean and standard deviation of the samples in data. See the Wikipedia page on the log-normal distribution for formulas giving its mean and variance in terms of mu and sigma.)
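As a minimal sketch of the point above, the following generates synthetic log-normal data, recovers mu and sigma from the fit, and then applies the standard formulas for the mean and variance of a log-normal distribution. The values of true_mu and true_sigma, the seed, and the sample size are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical parameters of the underlying normal distribution.
true_mu, true_sigma = 1.5, 0.4

rng = np.random.default_rng(0)
data = rng.lognormal(mean=true_mu, sigma=true_sigma, size=20_000)

# Fix loc at 0 so that only the shape s and the scale are estimated.
s, loc, scale = stats.lognorm.fit(data, floc=0)
estimated_mu = np.log(scale)
estimated_sigma = s

# Mean and variance of the log-normal distribution itself:
#   mean = exp(mu + sigma^2 / 2)
#   var  = (exp(sigma^2) - 1) * exp(2*mu + sigma^2)
lognormal_mean = np.exp(estimated_mu + estimated_sigma**2 / 2)
lognormal_var = (np.exp(estimated_sigma**2) - 1) * np.exp(
    2 * estimated_mu + estimated_sigma**2
)

print("mu, sigma:", estimated_mu, estimated_sigma)
print("log-normal mean:", lognormal_mean, "sample mean:", data.mean())
print("log-normal variance:", lognormal_var, "sample variance:", data.var())
```

Here estimated_mu and estimated_sigma should land near 1.5 and 0.4, while lognormal_mean is noticeably larger than exp(estimated_mu), which is why the two sets of quantities should not be confused.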
To plot the histogram together with the fitted PDF, you can use, for example:
import matplotlib.pyplot as plt
# density=True normalizes the histogram so it is comparable to the PDF
plt.hist(data, bins=50, density=True, color='c', alpha=0.75)
xmin = data.min()
xmax = data.max()
x = np.linspace(xmin, xmax, 100)
pdf = stats.lognorm.pdf(x, s, scale=scale)
plt.plot(x, pdf, 'k')
If you want to look at the log of the data, you can do something like the following. Note that the PDF of the normal distribution is used here.
logdata = np.log(data)
plt.hist(logdata, bins=40, density=True, color='c', alpha=0.75)
xmin = logdata.min()
xmax = logdata.max()
x = np.linspace(xmin, xmax, 100)
pdf = stats.norm.pdf(x, loc=estimated_mu, scale=estimated_sigma)
plt.plot(x, pdf, 'k')
By the way, an alternative to fitting with stats.lognorm is to fit log(data) using stats.norm.fit:
logdata = np.log(data)
estimated_mu, estimated_sigma = stats.norm.fit(logdata)
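To check that the two approaches agree, here is a small sketch that runs both on the same synthetic sample. The parameters 0.7 and 0.9, the seed, and the sample size are arbitrary choices for illustration. With loc fixed at 0, both fits are maximum-likelihood estimates of the same mu and sigma, so the results should match up to numerical tolerance.

```python
import numpy as np
from scipy import stats

# Synthetic sample; the parameters here are arbitrary illustration values.
rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.7, sigma=0.9, size=5_000)

# Approach 1: fit a log-normal distribution directly, with loc fixed at 0.
s, loc, scale = stats.lognorm.fit(data, floc=0)
mu_from_lognorm, sigma_from_lognorm = np.log(scale), s

# Approach 2: fit a normal distribution to the log of the data.
mu_from_norm, sigma_from_norm = stats.norm.fit(np.log(data))

print("mu:   ", mu_from_lognorm, mu_from_norm)
print("sigma:", sigma_from_lognorm, sigma_from_norm)
```

Note that stats.norm.fit returns the maximum-likelihood standard deviation (dividing by n rather than n - 1), which is the quantity that corresponds to the shape parameter s.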