How is homography calculated?

I am having quite a bit of trouble understanding how plane-to-plane homography works. In particular, I would like to know how the OpenCV method works.

Is it something like ray tracing? How is a homogeneous coordinate different from a scale * vector?

Everything I read assumes you already know what they are talking about, so it's hard to follow!

+6
2 answers

Googling homography estimation returns this as the first link (at least for me): http://cseweb.ucsd.edu/classes/wi07/cse252a/homography_estimation/homography_estimation.pdf . Admittedly it is not a great description, and much is left out. If you want to learn these concepts properly, reading a good book such as Multiple View Geometry in Computer Vision would be much better than reading short articles. Short articles like that often contain serious errors, so be careful.

In short, a cost function is defined, and the parameters (the elements of the homography matrix) that minimize this cost function are the answer we are looking for. A meaningful cost function is a geometric one, i.e. one that has a geometric interpretation. For homography, we want to find H such that, when points are transferred from one image to the other, the distance between all the points and their correspondences is minimal. This geometric cost function is non-linear, which means: 1 - an iterative method has to be used to solve it in the general case, 2 - the iterative method needs an initial starting point. This is where algebraic cost functions come in. These cost functions have no geometric meaning / interpretation. Often their design is more of an art, and for one problem you can usually find several algebraic cost functions with different properties. Their advantage is that they lead to linear optimization problems, so a closed-form solution exists (a one-shot, non-iterative method). Their disadvantage is that the solution found is not optimal. So the usual approach is to first minimize an algebraic cost, and then use the solution found as the starting point for an iterative geometric optimization. Now, if you google these cost functions for homography, you will find how they are usually defined.
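
To make the "geometric cost" concrete, here is a minimal sketch (not OpenCV's code) of a one-way transfer error in NumPy; the function name and the (N, 2) array layout are my own assumptions:

    import numpy as np

    def transfer_error(H, pts1, pts2):
        """Geometric cost: sum of squared distances between points mapped
        through H and their measured correspondences in the second image.

        pts1, pts2: (N, 2) arrays of corresponding points in images 1 and 2.
        H: 3x3 homography mapping image-1 points to image-2 points.
        """
        ones = np.ones((pts1.shape[0], 1))
        p1_h = np.hstack([pts1, ones])               # homogeneous coordinates
        p2_proj = (H @ p1_h.T).T                     # map through H
        p2_proj = p2_proj[:, :2] / p2_proj[:, 2:3]   # back to Cartesian
        return np.sum((p2_proj - pts2) ** 2)

Minimizing this over the 8 free parameters of H is the non-linear, iterative part described above.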

If you want to know which method is used in OpenCV, you just need to look at the code: http://code.opencv.org/projects/opencv/repository/entry/trunk/opencv/modules/calib3d/src/fundam.cpp#L81 This is the algebraic cost function of the DLT (Direct Linear Transform) defined in the book mentioned above; if you google "homography DLT" you should find some relevant documents. And here: http://code.opencv.org/projects/opencv/repository/entry/trunk/opencv/modules/calib3d/src/fundam.cpp#L165 an iterative procedure minimizes the geometric cost function. It looks like the Gauss-Newton method is implemented: http://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm
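
For reference, a rough sketch of what a basic DLT solve looks like (my own simplified version, not the OpenCV implementation; in practice you would also normalize the point coordinates first, as the book recommends):

    import numpy as np

    def dlt_homography(pts1, pts2):
        """Estimate H from >= 4 correspondences by minimizing the algebraic
        cost ||A h|| subject to ||h|| = 1 (Direct Linear Transform)."""
        A = []
        for (x1, y1), (x2, y2) in zip(pts1, pts2):
            # Two equations per correspondence, from x2 ~ H x1 written out
            # in homogeneous coordinates.
            A.append([-x1, -y1, -1, 0, 0, 0, x2 * x1, x2 * y1, x2])
            A.append([0, 0, 0, -x1, -y1, -1, y2 * x1, y2 * y1, y2])
        A = np.asarray(A)
        # The solution is the right singular vector of A with the smallest
        # singular value.
        _, _, Vt = np.linalg.svd(A)
        H = Vt[-1].reshape(3, 3)
        return H / H[2, 2]   # fix the arbitrary scale (assumes H[2,2] != 0)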

All of the above discussion assumes that you have correct correspondences between the two images. If some points are matched to the wrong points in the other image, then you have outliers, and the results of these methods will be completely off. This is where robust (outlier-resistant) methods come in. OpenCV gives you two options: 1. RANSAC 2. LMeDS. Google is your friend here.
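
If you just want to use OpenCV rather than reimplement anything, the Python binding looks roughly like this (the point coordinates here are made up; cv2.LMEDS can be passed instead of cv2.RANSAC):

    import cv2
    import numpy as np

    # Matched point coordinates, e.g. collected from a feature matcher.
    src_pts = np.float32([[10, 10], [200, 15], [190, 220], [12, 230], [100, 120]]).reshape(-1, 1, 2)
    dst_pts = np.float32([[14, 12], [205, 18], [196, 228], [15, 235], [104, 124]]).reshape(-1, 1, 2)

    # RANSAC with a 3-pixel reprojection threshold.
    H, inlier_mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 3.0)
    print(H)
    print(inlier_mask.ravel())   # 1 = inlier, 0 = rejected as an outlier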

Hope this helps.

+13

To answer your question, we need to go through 4 different steps:

 1. Define the homography.
 2. See what happens when noise or outliers are present.
 3. Find an approximate solution.
 4. Refine it.

  • A homography is a 3x3 matrix that maps 2D points. The map is linear in homogeneous coordinates: [x2, y2, 1]' ~ H * [x1, y1, 1]', where ' means transpose (for writing column vectors as rows) and ~ means that the map is defined only up to scale. It is easier to see in Cartesian coordinates (multiplying the numerator and denominator by the same factor does not change the result); there is also a small sketch of this after the list:

    x2 = (h11 * x1 + h12 * y1 + h13) / (h31 * x1 + h32 * y1 + h33)

    y2 = (h21 * x1 + h22 * y1 + h23) / (h31 * x1 + h32 * y1 + h33)

    You can see that in Cartesian coordinates the mapping is non-linear; keep that in mind for now.

  • We can easily solve the previous set of linear equations in homogeneous coordinates using linear-algebra least-squares methods (see DLT - Direct Linear Transformation), but this, unfortunately, only minimizes an algebraic error in the homography parameters. People care more about a different kind of error, namely the error that displaces points in Cartesian coordinates. If there is no noise and there are no outliers, the two errors can be identical. However, the presence of noise requires us to minimize the residuals in Cartesian coordinates (the residuals are just the squared differences between the left and right sides of the Cartesian equations above). Moreover, the presence of outliers requires the use of a robust method such as RANSAC. It picks the best set of inliers and rejects the outliers so that they don't contaminate our solution.

  • Since RANSAC finds the correct inliers by random trial and error over many iterations, we need a really fast way to compute a homography, and that is the linear approximation, which minimizes the error in the parameters (the wrong metric) but is otherwise close enough to the final solution (which minimizes squared errors in point coordinates, the right metric). We use the linear solution as an initial guess for further non-linear optimization;

  • The last step is to use our initial guess (the solution of the linear system, which minimizes an error in the homography parameters) when solving the non-linear problem (minimizing the sum of squared pixel errors). The reason for using squared residuals instead of their absolute values is that the Gaussian formula (which describes the noise) has a squared term in the exponent, exp(-(x-mu)^2 / (2*sigma^2)), so (omitting some probabilistic derivation) the maximum-likelihood solution requires squared residuals.
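
A small toy sketch of the first bullet (my own example, with a made-up H): applying H in homogeneous coordinates and checking that scaling H does not change the mapped point:

    import numpy as np

    def map_point(H, x1, y1):
        """[x2, y2, w]' ~ H [x1, y1, 1]'; dividing by w gives the Cartesian
        result, which is exactly the non-linear formula in the first bullet."""
        p = H @ np.array([x1, y1, 1.0])
        return p[0] / p[2], p[1] / p[2]

    H = np.array([[1.0,   0.2,   5.0],
                  [0.1,   1.1,  -3.0],
                  [0.001, 0.002, 1.0]])

    print(map_point(H, 10.0, 20.0))
    print(map_point(5.0 * H, 10.0, 20.0))   # same point: H is defined up to scale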

To perform the non-linear optimization, the Levenberg-Marquardt method is usually used. But as a first approximation you can simply use gradient descent (note that the gradient points uphill, but we are looking for a minimum, so we move against it, hence the minus sign below). In a nutshell, we run iterations t = 1..N and update the homography parameters at iteration t as param(t) = param(t-1) - k * gradient, where gradient = d_cost / d_param.
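
A toy sketch of that update rule (my own code, with a finite-difference gradient; the step size k is a made-up constant that would need tuning, and Levenberg-Marquardt, e.g. scipy.optimize.least_squares, is what you would use in practice):

    import numpy as np

    def reprojection_cost(h, pts1, pts2):
        """Sum of squared Cartesian residuals for the 9 homography parameters h."""
        H = h.reshape(3, 3)
        p1 = np.hstack([pts1, np.ones((len(pts1), 1))])
        p2 = (H @ p1.T).T
        p2 = p2[:, :2] / p2[:, 2:3]
        return np.sum((p2 - pts2) ** 2)

    def refine_gradient_descent(H0, pts1, pts2, k=1e-9, iters=100, eps=1e-6):
        """param(t) = param(t-1) - k * d_cost/d_param, gradient by central differences."""
        h = H0.astype(float).ravel()
        for _ in range(iters):
            grad = np.zeros(9)
            for i in range(9):
                dh = np.zeros(9)
                dh[i] = eps
                grad[i] = (reprojection_cost(h + dh, pts1, pts2)
                           - reprojection_cost(h - dh, pts1, pts2)) / (2 * eps)
            h = h - k * grad   # step against the gradient (the minus sign)
        H = h.reshape(3, 3)
        return H / H[2, 2]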

Bonus material: to further reduce the noise in your homography, you can try a few tricks: reduce the search space for points (start tracking your points); use different features (lines, conics, etc., which are also transformed by a homography but may have a higher SNR); reject impossible homographies to speed up RANSAC (for example, those that correspond to "impossible" point motions); low-pass filter small changes in the homography that may be due to noise.

+5

Source: https://habr.com/ru/post/914855/

