Algorithm for fitting a polynomial to part of a data set

I have an algorithm problem. I don't know if Stack Overflow is the right place to post it, but since I use Matlab and want to do this in Matlab, I am posting it here. My problem is this: I have a data set that I know little about, except that the points at the end of the set should be fairly linear. I want to make a linear fit of those linearly distributed points without using the part that is not linear.

(A picture always makes things clearer.)

As you can see, I have blue data that is not linear, but at the end there is a linear part (the red part). I would like to find an algorithm that tells me where the data curve stops being linear.

I hope that is clear.

I tried taking a few points on the right and making a linear fit of just those points, then adding a few more points to the list and checking whether they are close enough to the linear fit, then making a linear fit again with the added points, and so on. I don't think this is the best solution, though, because the "first" points have a lot of noise (which is not shown here) ...

Do you have any ideas or suggestions or links?

Thanks!

+6
4 answers

"What I would like is to find an algorithm that lets me know when the behavior of the data curve ends its linearity."

Linear data has a particularly nice property: its slope is constant. The second derivative of the linear section should therefore be approximately zero.

Use a spline fit (with some smoothing if the data is noisy) to get a continuous version of your data, and call it g(x). Wherever g''(x) ~ 0, i.e. wherever the second derivative is small, you are in a linear section.
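A minimal sketch of this idea in Python, under some loud assumptions: it swaps the spline for a simple centered moving average plus finite differences, it assumes the x values are evenly spaced, and the names `moving_average`, `linear_tail_start`, `half_width` and `tol` are made up for illustration. It walks from the right end and stops at the first point where the estimated second derivative is no longer small.

```python
def moving_average(y, half_width):
    """Centered moving average; the window shrinks symmetrically near the ends."""
    n = len(y)
    out = []
    for i in range(n):
        w = min(half_width, i, n - 1 - i)  # keep the window symmetric around i
        window = y[i - w:i + w + 1]
        out.append(sum(window) / len(window))
    return out


def linear_tail_start(x, y, half_width=2, tol=1e-3):
    """Index of the first point of the linear section at the right end.

    Assumes evenly spaced x (an assumption of this sketch, not of the answer).
    """
    g = moving_average(y, half_width)      # stand-in for the smoothed spline g(x)
    h = x[1] - x[0]                        # constant grid spacing
    # Central second difference; d2[k] estimates g'' at index k + 1.
    d2 = [(g[i - 1] - 2 * g[i] + g[i + 1]) / h ** 2 for i in range(1, len(g) - 1)]
    start = len(x) - 1
    for k in range(len(d2) - 1, -1, -1):   # walk from right to left
        if abs(d2[k]) > tol:
            break                          # curvature is no longer negligible
        start = k + 1
    return start
```

With noisy data, `tol` has to be chosen relative to the noise level left after smoothing, and a wider `half_width` pushes the detected start a little further into the linear region.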

+4

Sort the data by x position, and then set some criterion for linearity.

  • Start at one end.
  • Check the Pearson correlation coefficient of the next predefined chunk of the graph.
  • If it is above a certain threshold, add that chunk of x to your range; otherwise stop there.
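The three steps above could be sketched roughly like this in Python. The function names, the chunk size and the threshold are all assumptions for illustration; the sketch starts at the right end because that is where the question expects the linear part to be.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for constant data; treat as "not linear"
    return sxy / math.sqrt(sxx * syy)


def grow_linear_range(x, y, chunk=5, threshold=0.99):
    """Grow the linear range leftwards from the right end, one chunk at a time.

    Returns the index at which the accepted range starts (len(x) if even the
    first chunk fails the test).
    """
    start = len(x)
    while start > 0:
        candidate = max(0, start - chunk)
        r = pearson_r(x[candidate:start], y[candidate:start])
        if abs(r) < threshold:
            break            # this chunk is not linear enough: stop here
        start = candidate    # accept the chunk and move on to the next one
    return start
```

Two caveats with this sketch: a short chunk of any smooth curve also correlates strongly with x, so `chunk` and `threshold` need tuning against the actual data, and a perfectly horizontal section has an undefined correlation and is rejected here.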

Alternatively, you can test for linearity by comparing several polynomial fits and checking which one fits best. For this, I would:

  • Define polynomial models of order 1 to n, where n is quite small (maybe 3).
  • Add data points to the set being tested.
  • Compare the least-squares errors of your n models.
  • If the linear model has the smallest least-squares error, or is within some distance of the minimum over your n models, continue to add points. Otherwise, stop and say that the data was linear up to the last addition.
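Here is a rough Python sketch of this second procedure, using only the standard library. Everything named here (`polyfit_sse`, `linear_start_by_model_comparison`, `min_points`, `tol`) is hypothetical, and the per-point tolerance is just one possible reading of "within some distance of the minimum". Since a higher-order polynomial can never fit worse than a lower-order one on the same points, the tolerance is what actually decides the test.

```python
def polyfit_sse(xs, ys, degree):
    """Sum of squared residuals of a least-squares polynomial fit.

    x is centered and scaled to [-1, 1] first to keep the normal equations
    well conditioned; this does not change the residuals.
    """
    mx = sum(xs) / len(xs)
    scale = max(abs(v - mx) for v in xs) or 1.0
    ts = [(v - mx) / scale for v in xs]
    m = degree + 1
    # Normal equations A c = b.
    A = [[sum(t ** (j + k) for t in ts) for k in range(m)] for j in range(m)]
    b = [sum(yv * t ** j for t, yv in zip(ts, ys)) for j in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    c = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(A[r][k] * c[k] for k in range(r + 1, m))
        c[r] = (b[r] - s) / A[r][r]
    return sum((yv - sum(cj * t ** j for j, cj in enumerate(c))) ** 2
               for t, yv in zip(ts, ys))


def linear_start_by_model_comparison(x, y, max_degree=3, min_points=6, tol=1e-6):
    """Extend the window leftwards while the linear fit stays close to the best fit.

    Returns the start index of the last accepted window, or None if even the
    initial window of min_points points fails. Needs min_points > max_degree.
    """
    start = len(x) - min_points
    last_ok = None
    while start >= 0:
        xs, ys = x[start:], y[start:]
        sse = [polyfit_sse(xs, ys, d) for d in range(1, max_degree + 1)]
        if (sse[0] - min(sse)) / len(xs) > tol:
            break          # a higher-order model fits noticeably better
        last_ok = start
        start -= 1
    return last_ok
```

For noisy data, `tol` should be on the order of the noise variance; otherwise the higher-order fits will "win" simply by chasing noise.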

These are at least fairly simple ways to do this, and by my reading of Occam's razor they also have low complexity (n times the curve-fit complexity in both cases, although the second has a larger constant), although it is quite possible that algorithms with better complexity exist.

+1

One way is to fit a second-degree polynomial to a growing number of points, taken from right to left, and watch the third (quadratic) coefficient. As long as it stays small enough, the data is still fairly linear.

The catch is that it is hard to put a number on "small enough", except empirically.

Another way is to compare the linear approximation with the real data. Add points from right to left in the same way and track the standard deviation of the data around the approximation. As long as it stays satisfactory, the approximation is good and the data can be considered linear.

This is slightly better, because deviation is a fairly transparent concept.
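A small Python sketch of the second approach, assuming a fixed acceptable deviation `max_std` chosen by the user (that parameter, `min_points` and the function names are assumptions of this sketch). It refits a straight line each time a point is added on the left and stops when the residual standard deviation exceeds the limit.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def residual_std(xs, ys):
    """Standard deviation of the residuals around the fitted line (n - 2 dof)."""
    slope, intercept = linear_fit(xs, ys)
    sse = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))


def linear_start_by_residual_std(x, y, min_points=5, max_std=0.1):
    """Add points from right to left while the residual deviation stays acceptable.

    Returns the start index of the last accepted window, or None if the
    initial window already exceeds max_std. Needs min_points >= 3.
    """
    start = len(x) - min_points
    last_ok = None
    while start >= 0 and residual_std(x[start:], y[start:]) <= max_std:
        last_ok = start
        start -= 1
    return last_ok
```

A reasonable choice for `max_std` is a small multiple of the known or estimated measurement noise, which is what makes this criterion easier to reason about than a bound on a polynomial coefficient.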

+1

If you have piecewise linear behavior with abrupt transitions, you can try fitting a model of the form

 E[Y] = b0 + b1 * x + b2 * I + b3 * x * I 

where I is the indicator function, which is 1 when some condition is met and 0 otherwise. For your example, the condition could be x > 0. The coefficient b2 captures the vertical offset if the two segments are parallel, and the b3 term is an "interaction" that captures the change in slope between the two sides of the indicator's breakpoint.

If the transition is more gradual, as you drew it, I agree with @A.Webb's comment about a logistic with a trend.
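As a rough illustration in Python, assuming the breakpoint is already known: because the model contains both the indicator and its interaction with x, the least-squares fit is equivalent to fitting one straight line on each side of the breakpoint, so b2 and b3 are simply the differences between the two intercepts and the two slopes. The names `fit_line`, `fit_indicator_model` and `breakpoint_x` are made up for this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def fit_indicator_model(x, y, breakpoint_x=0.0):
    """Fit E[Y] = b0 + b1*x + b2*I + b3*x*I with I = 1 when x > breakpoint_x.

    Needs at least two distinct x values on each side of the breakpoint.
    """
    left = [(a, b) for a, b in zip(x, y) if a <= breakpoint_x]   # I = 0
    right = [(a, b) for a, b in zip(x, y) if a > breakpoint_x]   # I = 1
    b0, b1 = fit_line([a for a, _ in left], [b for _, b in left])
    r0, r1 = fit_line([a for a, _ in right], [b for _, b in right])
    # Right-hand line is (b0 + b2) + (b1 + b3) * x.
    return b0, b1, r0 - b0, r1 - b1
```

If the breakpoint is not known, one option is to try several candidate values of `breakpoint_x` and keep the one with the smallest total squared error.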

0

Source: https://habr.com/ru/post/949227/

