I have used the following approach, though it may be a bit over the top.
Consider adding d[i] to y[i] to get s[i], a smoothed value. We seek to minimize
S = Sum{ 1<=i<N-1 | sqr( s[i+1] - 2*s[i] + s[i-1]) } + f*Sum{ 0<=i<N | sqr( d[i]) }
The first term is the sum of squares of the (approximate) second derivative of the curve, and the second term penalises departure from the original; f is a (positive) constant. A little algebra recasts this as
S = sqr( ||A*d - b||)
where the matrix A has a nice structure: indeed, A'*A is penta-diagonal, which means the normal equations (i.e. d = Inv(A'*A)*A'*b) can be solved efficiently. Note that d is computed directly; there is no need for an initial estimate.
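To make this concrete, here is a minimal sketch in Python/NumPy (my own code, not part of the original recipe; it assumes N >= 4). It accumulates the bands of A'*A = D'*D + f*I directly, where D is the (N-2) x N second-difference matrix, and solves the pentadiagonal normal equations with a banded Cholesky solver. The band values follow from multiplying out D'*D.

    import numpy as np
    from scipy.linalg import solveh_banded

    def smooth(y, f):
        """Solve (D'D + f*I) d = -D'(D y); return the smoothed curve s = y + d."""
        y = np.asarray(y, dtype=float)
        N = len(y)
        # Upper bands of the pentadiagonal A'*A, in solveh_banded's layout.
        ab = np.zeros((3, N))
        ab[0, 2:] = 1.0                  # 2nd superdiagonal of D'D
        ab[1, 1:] = -4.0                 # 1st superdiagonal of D'D ...
        ab[1, 1] = ab[1, N - 1] = -2.0   # ... smaller at the two ends
        ab[2, :] = 6.0 + f               # diagonal of D'D plus the f*I term
        ab[2, 1] = ab[2, N - 2] = 5.0 + f
        ab[2, 0] = ab[2, N - 1] = 1.0 + f
        # Right-hand side A'*b = -D'(D y); convolving with [1,-2,1] applies D'.
        rhs = -np.convolve(np.diff(y, 2), [1.0, -2.0, 1.0])
        d = solveh_banded(ab, rhs)       # d is computed directly, no initial guess
        return y + d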
Given the solution d of this problem, we can compute the solution d^ of the same problem with the added constraint One'*d = 0 (where One is the vector of all ones) like this:
    d^ = d - (One'*d/Q) * e
    e  = Inv(A'*A) * One
    Q  = One'*e
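Continuing the sketch above (again my own code), this is one extra banded solve plus a rank-one correction; with smooth() as written, the unconstrained d is smooth(y, f) - y:

    def constrain_zero_sum(d, ab):
        """Impose One'*d = 0, given the upper bands ab of A'*A from smooth()."""
        e = solveh_banded(ab, np.ones(len(d)))   # e = Inv(A'*A) * One
        Q = e.sum()                              # Q = One'*e
        return d - (d.sum() / Q) * e             # d^ = d - (One'*d/Q) * e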
What value should be used for f? Well, a simple approach is to try the procedure on sample curves for various values of f and pick one that looks good. Another is to choose a measure of smoothness, for example the rms of the second derivative, decide what value it should attain, and then search for the f that yields that value. As a rule, the larger f is, the less smooth the smoothed curve will be.
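For the second approach, a simple bisection over f works, since raising f keeps s closer to the rough y and so raises the rms second difference. A sketch reusing smooth() from above; the bracketing bounds and iteration count are arbitrary choices of mine:

    def find_f(y, target_rms, f_lo=1e-8, f_hi=1e8, iters=60):
        """Search for the f whose smoothed curve has rms second difference target_rms."""
        def rms(f):
            r = np.diff(smooth(y, f), 2)
            return np.sqrt(np.mean(r * r))
        for _ in range(iters):
            f_mid = np.sqrt(f_lo * f_hi)   # geometric midpoint: f spans many decades
            if rms(f_mid) < target_rms:
                f_lo = f_mid               # still too smooth: increase f
            else:
                f_hi = f_mid
        return np.sqrt(f_lo * f_hi)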
Some motivation for all this: the aim is to find a "smooth" curve that is "close" to a given one. To do this we need a measure of smoothness (the first term in S) and a measure of closeness (the second term). Why these measures? Well, each is easy to compute, and each is quadratic in the variables (the d[]), which means the problem becomes an instance of linear least squares, for which there are efficient algorithms. Moreover, each term in each sum depends only on nearby values of the variables, which in turn means that the "inverse covariance" A'*A has a banded structure, and so the least-squares problem can be solved efficiently.

Why introduce f? Well, if we had no f (or had f = 0), we could minimize S by setting d[i] = -y[i], getting the perfectly smooth curve s[] = 0, which has nothing to do with the curve y. On the other hand, if f is gigantic, then minimizing S means concentrating on the second term and setting d[i] = 0, so our "smoothed" curve is just the original. The hope is that as f varies, the corresponding solutions range between very smooth but far from y (small f) and close to y but somewhat rough (large f).
It is often said that the normal equations, whose use I advocate here, are a poor way to solve least-squares problems, and that is generally true. However, for "nice" banded systems such as this one, the loss of stability in using the normal equations is not severe, while the gain in speed is considerable. I have used this approach to smooth curves with many thousands of points in reasonable time.
To see what A is, consider the case where we have 4 points. Then our expression for S reduces to:
sqr( s[2] - 2*s[1] + s[0]) + sqr( s[3] - 2*s[2] + s[1]) + f*(d[0]*d[0] + .. + d[3]*d[3]).
If we substitute s[i] = y[i] + d[i], we obtain, for example,
s[2] - 2*s[1] + s[0] = d[2]-2*d[1]+d[0] + y[2]-2*y[1]+y[0]
and so we see that for this to be sqr( ||A*d - b||) we must take (note the sqrt(f): the penalty in S is f*sqr(d[i]), so the matrix must carry its square root)

    A = (    1       -2        1        0    )      b = ( -(y[2]-2*y[1]+y[0]) )
        (    0        1       -2        1    )          ( -(y[3]-2*y[2]+y[1]) )
        ( sqrt(f)     0        0        0    )          (          0          )
        (    0     sqrt(f)     0        0    )          (          0          )
        (    0        0     sqrt(f)     0    )          (          0          )
        (    0        0        0     sqrt(f) )          (          0          )
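As a quick sanity check of this construction (my own snippet; the data values are arbitrary), one can form A and b densely for the 4-point case and confirm that the least-squares solution matches the normal-equations one:

    import numpy as np

    N, f = 4, 2.0
    y = np.array([0.0, 1.0, 4.0, 9.0])
    D = np.array([[1.0, -2.0, 1.0, 0.0],       # second-difference rows
                  [0.0, 1.0, -2.0, 1.0]])
    A = np.vstack([D, np.sqrt(f) * np.eye(N)]) # note sqrt(f), as above
    b = np.concatenate([-D @ y, np.zeros(N)])
    d = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares solution
    d2 = np.linalg.solve(A.T @ A, A.T @ b)     # d = Inv(A'*A) * A'*b
    assert np.allclose(d, d2)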
In an implementation, though, you probably wouldn't want to form A and b explicitly, since they are used only to build the normal-equations terms A'*A and A'*b; it is easier to accumulate those directly (which is what the first sketch above does).