Failure of a nonlinear fit to a sinusoidal curve

I tried to adjust the amplitude, frequency and phase of the sine curve, given some of the generated two-dimensional toy data. (Code at the end)

To get estimates for three parameters, I first perform an FFT. I use the values ​​from the FFT as initial guesses for the actual frequency and phase, and then for them (line by line). I wrote my code in such a way that I enter in which FFT bin I need the frequency, so I can check if the fitting works. But there is a rather strange behavior. If my input bit is 3.1 (not a built-in bit, so the FFT will not give me the correct frequency), then the fit works fine. But if the input bit is 3 (so the FFT outputs the exact frequency), then my fit failed, and I'm trying to figure out why.

Here's the conclusion when I give input bins (in the X and Y direction) as 3.0 and 2.1 respectively:

(The plot on the right is the data) fig1

Here's the output when I give input beans as 3.0 and 2.0: fig2

Question: Why does non-linear fit fail when entering the exact frequency of the curve?


the code:

#! /usr/bin/python # For the purposes of this code, it easier to think of the XY axes as transposed, # so the X axis is vertical and the Y axis is horizontal import numpy as np import matplotlib.pyplot as plt import scipy.optimize as optimize import itertools import sys PI = np.pi # Function which accepts paramters to define a sin curve # Used for the non linear fit def sineFit(t, a, f, p): return a * np.sin(2.0 * PI * f*t + p) xSize = 18 ySize = 60 npt = xSize * ySize # Get frequency bin from user input xFreq = float(sys.argv[1]) yFreq = float(sys.argv[2]) xPeriod = xSize/xFreq yPeriod = ySize/yFreq # arrays should be defined here # Generate the 2D sine curve for jj in range (0, xSize): for ii in range(0, ySize): sineGen[jj, ii] = np.cos(2.0*PI*(ii/xPeriod + jj/yPeriod)) # Compute 2dim FFT as well as freq bins along each axis fftData = np.fft.fft2(sineGen) fftMean = np.mean(fftData) fftRMS = np.std(fftData) xFreqArr = np.fft.fftfreq(fftData.shape[1]) # Frequency bins along x yFreqArr = np.fft.fftfreq(fftData.shape[0]) # Frequency bins along y # Find peak of FFT, and position of peak maxVal = np.amax(np.abs(fftData)) maxPos = np.where(np.abs(fftData) == maxVal) # Iterate through peaks in the FFT # For this example, number of loops will always be only one prevPhase = -1000 for col, row in itertools.izip(maxPos[0], maxPos[1]): # Initial guesses for fit parameters from FFT init_phase = np.angle(fftData[col,row]) init_amp = 2.0 * maxVal/npt init_freqY = yFreqArr[col] init_freqX = xFreqArr[row] cntr = 0 if prevPhase == -1000: prevPhase = init_phase guess = [init_amp, init_freqX, prevPhase] # Fit each row of the 2D sine curve independently for rr in sineGen: (amp, freq, phs), pcov = optimize.curve_fit(sineFit, xDat, rr, guess) # xDat is an linspace array, containing a list of numbers from 0 to xSize-1 # Subtract fit from original data and plot fitData = sineFit(xDat, amp, freq, phs) sub1 = rr - fitData # Plot fig1 = plt.figure() ax1 = fig1.add_subplot(121) p1, = ax1.plot(rr, 'g') p2, = ax1.plot(fitData, 'b') plt.legend([p1,p2], ["data", "fit"]) ax2 = fig1.add_subplot(122) p3, = ax2.plot(sub1) plt.legend([p3], ['residual1']) fig1.tight_layout() plt.show() cntr += 1 prevPhase = phs # Update guess for phase of sine curve 
+4
source share
5 answers

I tried to translate the important parts of your question into this answer.

  • First of all, try installing a single data block, not an array. Once you are sure that your model is sufficient, you can move on.
  • Your suitability will only be as good as your model if you move on to something that is not “sinusoidal”, as you will need to adjust accordingly.
  • Fixation is an “art," since the initial conditions can significantly change the convergence of the error function. In addition, your seizures may have more than one minimum, so you often have to worry about the uniqueness of the solution you offer.

While you were on the right track with the idea of ​​FFT, I think that your implementation was not entirely correct. The code below should be a great toy system. It generates random data like f(x) = a0*sin(a1*x+a2) . Sometimes a random initial guess will work, sometimes it will be ineffective. However, using the FFT assumption of frequency, convergence should always work for this system. Output Example:

enter image description here

 import numpy as np import pylab as plt import scipy.optimize as optimize # This is your target function def sineFit(t, (a, f, p)): return a * np.sin(2.0*np.pi*f*t + p) # This is our "error" function def err_func(p0, X, Y, target_function): err = ((Y - target_function(X, p0))**2).sum() return err # Try out different parameters, sometimes the random guess works # sometimes it fails. The FFT solution should always work for this problem inital_args = np.random.random(3) X = np.linspace(0, 10, 1000) Y = sineFit(X, inital_args) # Use a random inital guess inital_guess = np.random.random(3) # Fit sol = optimize.fmin(err_func, inital_guess, args=(X,Y,sineFit)) # Plot the fit Y2 = sineFit(X, sol) plt.figure(figsize=(15,10)) plt.subplot(211) plt.title("Random Inital Guess: Final Parameters: %s"%sol) plt.plot(X,Y) plt.plot(X,Y2,'r',alpha=.5,lw=10) # Use an improved "fft" guess for the frequency # this will be the max in k-space timestep = X[1]-X[0] guess_k = np.argmax( np.fft.rfft(Y) ) guess_f = np.fft.fftfreq(X.size, timestep)[guess_k] inital_guess[1] = guess_f # Guess the amplitiude by taking the max of the absolute values inital_guess[0] = np.abs(Y).max() sol = optimize.fmin(err_func, inital_guess, args=(X,Y,sineFit)) Y2 = sineFit(X, sol) plt.subplot(212) plt.title("FFT Guess : Final Parameters: %s"%sol) plt.plot(X,Y) plt.plot(X,Y2,'r',alpha=.5,lw=10) plt.show() 
+3
source

The problem is due to a poor initial assumption about the phase, and not about the frequency. When looping through the genSine lines (inner loop), you use the result of matching the previous line as the starting assumption for the next line, which does not always work. If you determine the phase from fft of the current line and use it as the original guess, it will succeed. You can change the inner loop as follows:

 for n,rr in enumerate(sineGen): fftx = np.fft.fft(rr) fftx = fftx[:len(fftx)/2] idx = np.argmax(np.abs(fftx)) init_phase = np.angle(fftx[idx]) print fftx[idx], init_phase ... 

Also you need to change

 def sineFit(t, a, f, p): return a * np.sin(2.0 * np.pi * f*t + p) 

to

 def sineFit(t, a, f, p): return a * np.cos(2.0 * np.pi * f*t + p) 

since phase = 0 means that the imaginary part fft is equal to zero and, therefore, the function is cosine.

Btw. in your example above, the definitions of sineGen and xDat are still missing.

+1
source

Not understanding most of your code, according to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html :

 (amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat, sub1, guess2) 

should become:

 (amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat, sub1, p0=guess2) 

Assuming tDat and sub1 are x and y, this should do the trick. But, again, it’s quite difficult to understand such a complex code with so many related variables and comments in general. The code should always be built from the bottom up, which means that you don’t perform a fit cycle when one of them doesn’t work, you don’t add noise until the code works to correspond to quiet examples ... Good luck!

0
source

In "nothing fancy," I meant something like the removal of TOTAL, which is not related to fitting, and makes a simplified example example, for example:

 import numpy as np import scipy.optimize as optimize def sineFit(t, a, f, p): return a * np.sin(2.0 * np.pi * f*t + p) # Create array of x and y with given parameters x = np.asarray(range(100)) y = sineFit(x, 1, 0.05, 0) # Give a guess and fit, printing result of the fitted values guess = [1., 0.05, 0.] print optimize.curve_fit(sineFit, x, y, guess)[0] 

The result of this is precisely the answer:

 [1. 0.05 0.] 

But if you change the assumption not too much, enough:

 # Give a guess and fit, printing result of the fitted values guess = [1., 0.06, 0.] print optimize.curve_fit(sineFit, x, y, guess)[0] 

the result gives absurdly wrong numbers:

 [ 0.00823701 0.06391323 -1.20382787] 

Can you explain this behavior?

0
source

You can use curve_fit with a number of trigonometric functions, usually very reliable and precision-tuned, that you need, just increasing the number of terms ... here is an example:

 from scipy import sin, cos, linspace def f(x, a0,s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12, c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12): return a0 + s1*sin(1*x) + c1*cos(1*x) \ + s2*sin(2*x) + c2*cos(2*x) \ + s3*sin(3*x) + c3*cos(3*x) \ + s4*sin(4*x) + c4*cos(4*x) \ + s5*sin(5*x) + c5*cos(5*x) \ + s6*sin(6*x) + c6*cos(6*x) \ + s7*sin(7*x) + c7*cos(7*x) \ + s8*sin(8*x) + c8*cos(8*x) \ + s9*sin(9*x) + c9*cos(9*x) \ + s10*sin(9*x) + c10*cos(9*x) \ + s11*sin(9*x) + c11*cos(9*x) \ + s12*sin(9*x) + c12*cos(9*x) from scipy.optimize import curve_fit pi/2. / (x.max() - x.min()) x_norm *= norm_factor popt, pcov = curve_fit(f, x_norm, y) x_fit = linspace(x_norm.min(), x_norm.max(), 1000) y_fit = f(x_fit, *popt) plt.plot( x_fit/x_norm, y_fit ) 
0
source

Source: https://habr.com/ru/post/1495323/


All Articles