I have a hypothetical function y of x and am trying to find the lognormal distribution curve that best fits the data. I use scipy's curve_fit function and can fit a normal distribution, but the resulting curve does not match the data well.
The data points y and x are shown below, where y = f(x).
y_axis = [0.00032425299473065838, 0.00063714106162861229, 0.00027009331177605913, 0.00096672396877715144, 0.002388766809835889, 0.0042233337680543182, 0.0053072824980722137, 0.0061291327849408699, 0.0064555344006149871, 0.0065601228278316746, 0.0052574034010282218, 0.0057924488798939255, 0.0048154093097913355, 0.0048619350036057446, 0.0048154093097913355, 0.0045114840997070331, 0.0034906838696562147, 0.0040069911024866456, 0.0027766995669134334, 0.0016595801819374015, 0.0012182145074882836, 0.00098231827111984341, 0.00098231827111984363, 0.0012863691645616997, 0.0012395921040321833, 0.00093554121059032721, 0.0012629806342969417, 0.0010057068013846018, 0.0006081017868837127, 0.00032743942370661445, 4.6777060529516312e-05, 7.0165590794274467e-05, 7.0165590794274467e-05, 4.6777060529516745e-05]
The y axis is the probability of the event occurring in each of the bins along the x axis:
x_axis = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0]
I got a closer fit to my data in Excel using a lognormal curve. When I try to use a lognormal in Python, the fit does not work, so I must be doing something wrong.
Below is the code I use for the normal distribution, which seems to be the only distribution I can fit in Python (hard to believe):
    # fitting distribution on top of savitzky-golay
    %matplotlib inline
    import matplotlib
    import matplotlib.pyplot as plt
    import pandas as pd
    import scipy
    import scipy.stats
    import numpy as np
    from scipy.stats import gamma, lognorm, halflogistic, foldcauchy
    from scipy.optimize import curve_fit

    matplotlib.rcParams['figure.figsize'] = (16.0, 12.0)
    matplotlib.style.use('ggplot')
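The snippet above contains only the imports, so the fitting call itself is not visible. As a minimal sketch of what a normal-curve fit with curve_fit could look like on the posted data: the helper name `scaled_normal`, the free amplitude parameter `amp`, and the starting values in `p0` are assumptions made for illustration, not taken from the original code. Two things are worth checking in a fit like this. The posted y values appear to sum to well below 1, so an unscaled normal pdf cannot reach them without an amplitude factor, and curve_fit starts every parameter at 1.0 unless `p0` is given.

```python
import numpy as np
from scipy.optimize import curve_fit

# Data copied from the question; x is the bin index 1..34 (same as x_axis).
x = np.arange(1.0, 35.0)
y = np.array([
    0.00032425299473065838, 0.00063714106162861229, 0.00027009331177605913,
    0.00096672396877715144, 0.002388766809835889, 0.0042233337680543182,
    0.0053072824980722137, 0.0061291327849408699, 0.0064555344006149871,
    0.0065601228278316746, 0.0052574034010282218, 0.0057924488798939255,
    0.0048154093097913355, 0.0048619350036057446, 0.0048154093097913355,
    0.0045114840997070331, 0.0034906838696562147, 0.0040069911024866456,
    0.0027766995669134334, 0.0016595801819374015, 0.0012182145074882836,
    0.00098231827111984341, 0.00098231827111984363, 0.0012863691645616997,
    0.0012395921040321833, 0.00093554121059032721, 0.0012629806342969417,
    0.0010057068013846018, 0.0006081017868837127, 0.00032743942370661445,
    4.6777060529516312e-05, 7.0165590794274467e-05, 7.0165590794274467e-05,
    4.6777060529516745e-05,
])


def scaled_normal(x, amp, mu, sigma):
    """Gaussian bell with a free peak height (hypothetical helper)."""
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Data-driven starting point: peak height, peak position, and a guessed width.
# The width guess of 5 bins is an assumption; adjust it for other data.
p0 = [y.max(), x[np.argmax(y)], 5.0]
popt, pcov = curve_fit(scaled_normal, x, y, p0=p0)
amp, mu, sigma = popt

# Sum of squared residuals, useful for comparing against other curve shapes.
sse_normal = np.sum((y - scaled_normal(x, *popt)) ** 2)

print("sum of y:", y.sum())
print("amp, mu, sigma:", amp, mu, sigma)
print("sum of squared residuals:", sse_normal)
```

A symmetric bell can only do so much here: the data rise quickly and fall off slowly, so some mismatch in the right-hand tail is to be expected even with good starting values.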
I am trying to get answers to two questions:
- Is this the best fit I can get from a normal distribution curve? How can I improve it?
Normal distribution result: 
- How can I fit a lognormal distribution to this data, or is there a better distribution that I can use?
I experimented with a lognormal distribution curve, tuning mu and sigma by hand, and it seems better suited to the data. I do not understand what I am doing wrong that keeps me from getting similar results in Python.
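For reference, here is a hedged sketch of one way a lognormal curve could be fitted to the posted data with curve_fit. It assumes that mu and sigma are the mean and standard deviation of ln(x), as in Excel's LOGNORM.DIST; in scipy.stats.lognorm that corresponds to `s=sigma` and `scale=exp(mu)`, which is an easy place to go wrong. The helper name `scaled_lognormal`, the amplitude parameter `amp`, the starting values, and the bounds are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import lognorm

# Data copied from the question; x is the bin index 1..34 (same as x_axis).
x = np.arange(1.0, 35.0)
y = np.array([
    0.00032425299473065838, 0.00063714106162861229, 0.00027009331177605913,
    0.00096672396877715144, 0.002388766809835889, 0.0042233337680543182,
    0.0053072824980722137, 0.0061291327849408699, 0.0064555344006149871,
    0.0065601228278316746, 0.0052574034010282218, 0.0057924488798939255,
    0.0048154093097913355, 0.0048619350036057446, 0.0048154093097913355,
    0.0045114840997070331, 0.0034906838696562147, 0.0040069911024866456,
    0.0027766995669134334, 0.0016595801819374015, 0.0012182145074882836,
    0.00098231827111984341, 0.00098231827111984363, 0.0012863691645616997,
    0.0012395921040321833, 0.00093554121059032721, 0.0012629806342969417,
    0.0010057068013846018, 0.0006081017868837127, 0.00032743942370661445,
    4.6777060529516312e-05, 7.0165590794274467e-05, 7.0165590794274467e-05,
    4.6777060529516745e-05,
])


def scaled_lognormal(x, amp, mu, sigma):
    """Lognormal pdf times a free amplitude (hypothetical helper).

    mu and sigma describe ln(x); scipy expects s=sigma and scale=exp(mu).
    """
    return amp * lognorm.pdf(x, s=sigma, scale=np.exp(mu))


# Starting values: total mass of the data, log of the peak position,
# and a guessed shape of 0.5 (an assumption).
p0 = [y.sum(), np.log(x[np.argmax(y)]), 0.5]

# Keep amp non-negative and sigma strictly positive so the pdf stays defined.
bounds = ([0.0, -np.inf, 1e-6], [np.inf, np.inf, np.inf])

popt, pcov = curve_fit(scaled_lognormal, x, y, p0=p0, bounds=bounds)
amp, mu, sigma = popt

# Sum of squared residuals, for comparison with other candidate curves.
sse_lognormal = np.sum((y - scaled_lognormal(x, *popt)) ** 2)

print("amp, mu, sigma:", amp, mu, sigma)
print("sum of squared residuals:", sse_lognormal)
```

Comparing the sum of squared residuals across candidate curves (normal, lognormal, gamma, and so on) fitted the same way gives a simple numeric basis for choosing between them.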