To answer your question, we need to work through four smaller questions:
1. Define a homography.
2. See what happens when noise or outliers are present.
3. Find an approximate solution.
4. Refine it.
A homography is a 3x3 matrix that maps two-dimensional points. The map is linear in homogeneous coordinates: [x2, y2, 1]' ~ H * [x1, y1, 1]', where ' means transpose (for writing column vectors as rows) and ~ means the map is defined up to scale. It is easier to see in Cartesian coordinates (multiplying the numerator and denominator by the same factor does not change the result):
x2 = (h11 * x1 + h12 * y1 + h13) / (h31 * x1 + h32 * y1 + h33)
y2 = (h21 * x1 + h22 * y1 + h23) / (h31 * x1 + h32 * y1 + h33)
You can see that in Cartesian coordinates the mapping is non-linear; just keep that in mind for now.
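Here is a minimal sketch of that difference (using NumPy; the matrix H below is made up purely for illustration): the homogeneous map is a single matrix multiply, while the Cartesian result needs a division by the third coordinate, which is where the non-linearity comes from.

```python
import numpy as np

H = np.array([[1.0,   0.1,   5.0],
              [0.0,   1.2,   3.0],
              [0.001, 0.002, 1.0]])    # hypothetical homography, for illustration only

p1 = np.array([10.0, 20.0, 1.0])       # point (x1, y1) in homogeneous coordinates
p2 = H @ p1                            # linear step in homogeneous coordinates
x2, y2 = p2[0] / p2[2], p2[1] / p2[2]  # perspective division: non-linear in Cartesian coordinates
print(x2, y2)
```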
We can easily solve this set of linear equations in homogeneous coordinates using least-squares linear algebra (see DLT - Direct Linear Transformation), but this, unfortunately, only minimizes an algebraic error in the homography parameters. People care more about another kind of error, namely the error by which points are shifted in Cartesian coordinates. If there is no noise and no outliers, the two errors can be identical. However, the presence of noise requires that we minimize the residuals in Cartesian coordinates (the residuals are just the differences between the left and right sides of the Cartesian equations, and we minimize the sum of their squares). In addition, the presence of outliers requires a robust method such as RANSAC, which picks the largest set of inliers and rejects the outliers so they do not pollute our solution.
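A minimal DLT sketch is below (the helper name `dlt_homography` is my own): each correspondence contributes two linear equations, and the singular vector of the smallest singular value minimizes the algebraic error ||A h|| subject to ||h|| = 1, which is exactly the "wrong but fast" metric discussed above.

```python
import numpy as np

def dlt_homography(pts1, pts2):
    """pts1, pts2: (N, 2) NumPy arrays of matched points, N >= 4."""
    rows = []
    for (x1, y1), (x2, y2) in zip(pts1, pts2):
        # two rows of A*h = 0 derived from the Cartesian equations above
        rows.append([-x1, -y1, -1, 0, 0, 0, x2 * x1, x2 * y1, x2])
        rows.append([0, 0, 0, -x1, -y1, -1, y2 * x1, y2 * y1, y2])
    A = np.asarray(rows)
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)   # right singular vector of the smallest singular value
    return H / H[2, 2]         # fix the arbitrary overall scale
```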
Since RANSAC finds the correct inliers by random trial and error over many iterations, we need a really fast way to compute a homography, and that is the linear approximation: it minimizes the parameter error (the wrong metric) but is otherwise close enough to the final solution (which minimizes the squared error in point coordinates, the right metric). We use the linear solution as an initial guess for the subsequent non-linear optimization.
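A rough RANSAC sketch under these assumptions (it reuses the `dlt_homography` helper above; the iteration count and pixel threshold are placeholders you would tune): fit H to 4 random matches, count inliers by their Cartesian reprojection error, keep the best set, and re-fit on all inliers to get the initial guess.

```python
import numpy as np

def reprojection_errors(H, pts1, pts2):
    ones = np.ones((len(pts1), 1))
    proj = (H @ np.hstack([pts1, ones]).T).T
    proj = proj[:, :2] / proj[:, 2:3]              # back to Cartesian coordinates
    return np.linalg.norm(proj - pts2, axis=1)     # pixel distance per point

def ransac_homography(pts1, pts2, n_iters=1000, thresh=3.0):
    best_H, best_inliers = None, np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        idx = np.random.choice(len(pts1), 4, replace=False)
        H = dlt_homography(pts1[idx], pts2[idx])   # fast linear solution on a minimal sample
        inliers = reprojection_errors(H, pts1, pts2) < thresh
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    # re-fit on all inliers; this becomes the initial guess for non-linear refinement
    return dlt_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers
```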
The last step is to use our initial guess (the solution of the linear system, which minimizes the error in the homography parameters) as the starting point for the non-linear optimization (which minimizes the sum of squared pixel errors). By the way, the reason for using squared residuals rather than their absolute values is that the Gaussian distribution (describing the noise) has a squared exponent, exp(-(x - mu)^2 / (2 * sigma^2)), so (omitting some probabilistic derivation) the maximum-likelihood solution requires squared residuals.
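For concreteness, here is a sketch of the geometric cost being refined (the sum of squared pixel residuals); parameterizing H by its first 8 entries with h33 fixed to 1 is an assumption made here for illustration, not the only possible choice.

```python
import numpy as np

def reprojection_cost(params, pts1, pts2):
    H = np.append(params, 1.0).reshape(3, 3)   # 8 free parameters, h33 fixed to 1
    ones = np.ones((len(pts1), 1))
    proj = (H @ np.hstack([pts1, ones]).T).T
    proj = proj[:, :2] / proj[:, 2:3]          # Cartesian coordinates of mapped points
    return np.sum((proj - pts2) ** 2)          # maximum-likelihood cost under Gaussian noise
```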
To perform the non-linear optimization, the Levenberg-Marquardt method is usually used. But as a first approximation, you can simply use gradient descent (note that the gradient points uphill, but we are looking for a minimum, so we step against it, hence the minus sign below). In a nutshell, we run iterations t = 1..N and update the homography parameters at iteration t as param(t) = param(t-1) - k * gradient, where gradient = d_cost / d_param.
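A bare-bones gradient-descent sketch of that update is below (it assumes the `reprojection_cost` helper above; the numerical finite-difference gradient and the fixed step size k are simplifications you would have to tune, which is part of why Levenberg-Marquardt is preferred in practice).

```python
import numpy as np

def refine_homography(H0, pts1, pts2, k=1e-9, n_iters=100, eps=1e-6):
    params = (H0 / H0[2, 2]).ravel()[:8]       # initial guess from the linear (DLT) solution
    for _ in range(n_iters):
        grad = np.zeros_like(params)
        for i in range(8):                      # finite-difference estimate of d_cost / d_param
            delta = np.zeros_like(params)
            delta[i] = eps
            grad[i] = (reprojection_cost(params + delta, pts1, pts2)
                       - reprojection_cost(params - delta, pts1, pts2)) / (2 * eps)
        params = params - k * grad              # step downhill, hence the minus sign
    return np.append(params, 1.0).reshape(3, 3)
```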
Bonus material: to further reduce the noise in your homography, you can try a few tricks: reduce the search space for point matches (start tracking your points); use different features (lines, conics, etc., which are also transformed by a homography and may have a higher SNR); reject impossible homographies early to speed up RANSAC (for example, ones that correspond to "impossible point motions"); low-pass filter small changes in the homography over time that are likely due to noise.