Measure the distance to an object with a single camera in a static scene

Let's say I place a small object on a flat floor inside a room.

  • First step: photograph the floor of the room from a known, fixed camera position in the world coordinate system.
  • Second step: detect the lower edge of the object in the image and map its pixel coordinates to the object's position in the world coordinate system.
  • Third step: measure the actual distance to the object with a measuring tape.

I could move a small object, repeat these three steps for each pixel coordinate and create a lookup table (key: pixel coordinate, value: distance). This procedure is accurate enough for my use. I know it is problematic if there are multiple objects (one object may occlude another).
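For reference, a minimal sketch of that brute-force lookup table (the entries and the nearest-neighbour fallback are invented for illustration, not part of the procedure described above):

```python
# key: (u, v) pixel coordinate of the object's lower edge, value: measured distance in metres
lookup = {
    (320, 400): 1.52,
    (318, 380): 1.71,
    (322, 360): 1.93,
    # ... one entry per measured pixel coordinate
}

def distance_at(u, v):
    """Return the measured distance for (u, v), falling back to the nearest calibrated pixel."""
    if (u, v) in lookup:
        return lookup[(u, v)]
    nearest = min(lookup, key=lambda p: (p[0] - u) ** 2 + (p[1] - v) ** 2)
    return lookup[nearest]

print(distance_at(319, 381))
```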

My question is: is there an easier way to create this lookup table? Accidentally changing the camera angle by a few degrees destroys hours of hard work. ;)

Perhaps it is possible to perform the three steps for only a few specific pixel coordinates (or positions in the world coordinate system) and then do some "calibration" that lets me compute distances from the fitted parameters?

+6
4 answers

If the floor is flat, it satisfies a plane equation, say

ax + by + cz = 1 

in camera coordinates (the origin is the optical center of the camera, XY spans the focal plane and Z is the viewing direction).

Then the ray from the camera center through the image point with pixel coordinates (u, v) is parameterized by

 (u, v, f) · t 

where f is the focal length (in pixels) and t > 0 parameterizes the ray.

The ray hits the plane when

 (au + bv + cf) t = 1, 

i.e. at the point

 (u, v, f) / (au + bv + cf) 

Finally, the distance from the camera to that point is

 p = √(u² + v² + f²) / (au + bv + cf) 

This is the function you would tabulate. Assuming f is known, you can determine the unknown coefficients a, b, c by taking three non-collinear points on the floor, measuring their image coordinates (u, v) and distances, and solving a 3x3 linear system.

From the last equation you can estimate the distance for any point in the image.
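A minimal sketch of that calibration in Python/NumPy, under the assumptions of this answer: f is known in pixels, pixel coordinates (u, v) are measured relative to the image center, and the measured distances and names below are invented for illustration.

```python
import numpy as np

def fit_plane(calib, f):
    """Solve for plane coefficients (a, b, c) from three non-collinear
    calibration points given as (u, v, distance) tuples.

    Each measurement gives one linear equation:
        distance * (a*u + b*v + c*f) = sqrt(u^2 + v^2 + f^2)
    """
    A = np.array([[d * u, d * v, d * f] for u, v, d in calib])
    rhs = np.array([np.sqrt(u**2 + v**2 + f**2) for u, v, _ in calib])
    return np.linalg.solve(A, rhs)               # (a, b, c)

def distance(u, v, f, abc):
    """Distance from the camera to the floor point seen at pixel (u, v)."""
    a, b, c = abc
    return np.sqrt(u**2 + v**2 + f**2) / (a * u + b * v + c * f)

# Made-up but plausible measurements: (u, v, measured distance in metres)
calib = [(-200.0, 150.0, 4.09), (180.0, 140.0, 4.13), (10.0, 320.0, 3.38)]
f = 1200.0                                        # focal length in pixels (assumed known)
abc = fit_plane(calib, f)
print(distance(50.0, 200.0, f, abc))              # distance for an arbitrary pixel
```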

The focal length can be measured (in pixels) by viewing a target of known size at a known distance: by similar triangles, f is the target's size in the image (in pixels) multiplied by the distance and divided by the target's physical size.
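For example (numbers invented for illustration): a board 0.5 m wide, photographed from 2 m away and spanning 600 pixels in the image, gives f ≈ 600 · 2 / 0.5 = 2400 pixels.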

+5

Most vision libraries (including OpenCV) have built-in functions that take a set of points in the camera's frame of reference and the corresponding points on a Cartesian plane and compute the warp matrix (affine transformation) for you. (Some of them can even fit non-linear mappings given enough input points, but that brings you back to the time-to-calibrate problem.)

One final note: most vision libraries calibrate against some kind of grid sheet (e.g. a checkerboard). If you write your calibration to work from such a sheet, you only need to measure the distance to a single target, since the transformation is computed from the sheet and the target merely provides the world offset.
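As an illustration of the grid idea, here is a sketch using OpenCV's checkerboard detector to get a floor-plane homography. The board geometry, square size and file name are assumptions, not part of the answer.

```python
import cv2
import numpy as np

# Assumed setup: a 9x6 inner-corner checkerboard with 25 mm squares lying flat on the floor.
PATTERN = (9, 6)
SQUARE_MM = 25.0

img = cv2.imread("floor_with_checkerboard.jpg", cv2.IMREAD_GRAYSCALE)
found, corners = cv2.findChessboardCorners(img, PATTERN)
if not found:
    raise RuntimeError("checkerboard not detected")

# Pixel coordinates of the detected corners, as an N x 2 array.
img_pts = corners.reshape(-1, 2).astype(np.float32)

# Matching world coordinates (in mm) on the floor plane, ordered row by row.
world_pts = np.array([[x * SQUARE_MM, y * SQUARE_MM]
                      for y in range(PATTERN[1])
                      for x in range(PATTERN[0])], dtype=np.float32)

# Homography mapping image pixels to floor coordinates.
H, _ = cv2.findHomography(img_pts, world_pts, cv2.RANSAC)

# Map an arbitrary pixel (e.g. the object's lower edge) to floor coordinates.
px = np.array([[[412.0, 530.0]]], dtype=np.float32)
print(cv2.perspectiveTransform(px, H))
```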

+2
source

I believe that what you need is called a Projective Transformation. Below is a link to what you need.

A demonstration of computing the projective transformation, with the mathematics properly laid out, on Mathematics SE.

Although you could solve it by hand and hard-code the result, I strongly recommend using a matrix math library (or even writing your own matrix functions) rather than working the equations out manually, since solving them symbolically and turning that into code is very expensive and prone to miscalculation.

Here are some tips to help clarify how to apply it to your problem (a code sketch follows the list):

- Your matrix A (source) is built from 4 points in your camera image (pixel locations).

- Your matrix B (destination) is built from your measurements in the real world.

- For quick recalibration, I suggest marking points on the ground so you can quickly place the cube at the 4 locations (and then grab the new pixel locations from the camera) without redoing the measurements.

- You only have to perform steps 1-5 once, during calibration; after that, whenever you want to know the position of something, just take its image coordinates and run them through steps 6 and 7.

- You want your calibration points to be as far apart as possible (within reason: at extreme distances, as the view converges, you quickly lose pixel density and therefore accuracy in the source image). Make sure no 3 of the points are collinear (simply put, lay your 4 points out in a rough square that spans nearly the full field of view of your camera in the real world).
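Here is a minimal sketch of that four-point calibration with OpenCV; the coordinates are invented, and the answer's matrices A and B correspond to `img_pts` and `world_pts` below.

```python
import cv2
import numpy as np

# Matrix A (source): 4 pixel locations of the calibration marks in the camera image.
img_pts = np.array([[112.0,  88.0], [530.0,  95.0],
                    [585.0, 420.0], [ 70.0, 415.0]], dtype=np.float32)

# Matrix B (destination): the same 4 marks measured on the floor, in metres.
world_pts = np.array([[0.0, 0.0], [2.0, 0.0],
                      [2.0, 1.5], [0.0, 1.5]], dtype=np.float32)

# 3x3 projective transformation from image to floor coordinates.
M = cv2.getPerspectiveTransform(img_pts, world_pts)

# Apply the transformation to a pixel coordinate to get its floor position.
pixel = np.array([[[300.0, 250.0]]], dtype=np.float32)
x1, y1 = cv2.perspectiveTransform(pixel, M)[0, 0]
print(x1, y1)   # position of the object on the floor, in metres
```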

P.S. I apologize for not writing the math out here, but they have fantastic math editing over there and it looks cleaner!

Final steps to apply this method to this situation:

To perform this calibration, you will need to set a global home position (most likely chosen arbitrarily on the floor; then measure the camera's position relative to that point). From this home position you measure your object's x and y coordinates on the floor. Although a more densely packed calibration set will give you larger errors, the easiest solution is to use a sheet of known dimensions (a piece of printer paper, a big board, or similar). The reason this is easier is that it has built-in axes (i.e. its sides are orthogonal), so you can simply use the four corners of the sheet and its known dimensions in your calibration. E.g. for a US-letter sheet of paper your points (in inches) would be (0, 0), (0, 8.5), (11, 8.5), (11, 0).

Using these points and the pixel coordinates you get for them creates your transformation matrix, but it still only gives you a global x, y position on axes that may be awkward to measure against (they may be skewed depending on how you measured/calibrated). To get a distance, you therefore need the offset of your camera:

object position in world coordinates (from the steps above): (x1, y1); camera position: (Xc, Yc)

 dist = sqrt(pow(x1 - Xc, 2) + pow(y1 - Yc, 2)) 

If it is too cumbersome to measure the camera's position from the global origin by hand, you can instead measure the distances to 2 different known points and plug those values into the equation above to solve for the camera's offset, which you then save and reuse whenever you want the final distance.
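A sketch of that two-point approach, with the usual caveat that two distance measurements give two mirror-image candidates for the camera position and you must pick the correct side yourself; the helper name and numbers are invented.

```python
import math

def camera_offset(p1, d1, p2, d2):
    """Candidate camera positions (Xc, Yc) on the floor plane, given two known
    floor points p1, p2 and the measured camera-to-point distances d1, d2.
    Classic two-circle intersection; generally returns two candidates."""
    (x1, y1), (x2, y2) = p1, p2
    dx, dy = x2 - x1, y2 - y1
    d = math.hypot(dx, dy)                      # distance between the two known points
    if d == 0 or d > d1 + d2 or d < abs(d1 - d2):
        raise ValueError("circles do not intersect: check the measurements")
    a = (d1**2 - d2**2 + d**2) / (2 * d)        # distance from p1 to the chord midpoint
    h = math.sqrt(max(d1**2 - a**2, 0.0))       # half-length of the chord
    mx, my = x1 + a * dx / d, y1 + a * dy / d
    return [(mx + h * dy / d, my - h * dx / d),
            (mx - h * dy / d, my + h * dx / d)]

# Made-up example: two floor points and their measured distances to the camera.
print(camera_offset((0.0, 0.0), 2.5, (2.0, 0.0), 2.0))
# Pick the candidate that matches the side of the line the camera is actually on.
```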

+2

As mentioned in the previous answers, what you need is a projective transformation, also simply called a homography. Nevertheless, I will look at this from a more practical point of view and try to keep the summary brief and simple.

So, given the correct homography, you can warp your photo of the plane so that it looks as if you took it from above (for example, here). Even simpler, you can convert a pixel coordinate of your image to the world coordinates of the plane (warping just does the same thing for every pixel).
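As an illustration of the warping variant, a sketch that produces a top-down view of the floor; the file name, image sizes and corner coordinates are invented, and the homography here is built from four known rectangle corners rather than taken from a calibration step in the answer.

```python
import cv2
import numpy as np

img = cv2.imread("floor.jpg")

# Homography from the original image to an 800x600 top-down view, obtained here
# from the four image corners of a known rectangle on the floor.
corners_img = np.float32([[112, 88], [530, 95], [585, 420], [70, 415]])
corners_top = np.float32([[0, 0], [800, 0], [800, 600], [0, 600]])
H = cv2.getPerspectiveTransform(corners_img, corners_top)

top_down = cv2.warpPerspective(img, H, (800, 600))   # bird's-eye view of the floor
cv2.imwrite("floor_top_down.jpg", top_down)
```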

A homography is basically a 3x3 matrix, and you transform a coordinate by multiplying it with the matrix. Now you might think: wait, a 3x3 matrix and 2D coordinates? You will need to use homogeneous coordinates.
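Concretely (standard homogeneous-coordinate arithmetic, spelled out for illustration): with H the 3x3 homography and (u, v) a pixel coordinate,

 (x', y', w) = H · (u, v, 1) 

and the world coordinates of that pixel are (x'/w, y'/w).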

However, most frameworks and libraries will do this handling for you. You need to find (at least) four points (x/y coordinates) on your world plane / floor (preferably the corners of a rectangle aligned with your desired world coordinate system), photograph them, measure their pixel coordinates and pass both sets to the "find homography" function of your computer vision and math library of choice.

In OpenCV, that would be findHomography; here is an example (the perspectiveTransform method then does the actual conversion).
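A minimal sketch along those lines; the coordinates and the assumed camera position are invented, and at least 4 correspondences are required.

```python
import cv2
import numpy as np

# Pixel coordinates of four marked floor points, and their measured world coordinates (metres).
img_pts = np.float32([[112, 88], [530, 95], [585, 420], [70, 415]])
world_pts = np.float32([[0.0, 0.0], [2.0, 0.0], [2.0, 1.5], [0.0, 1.5]])

H, _ = cv2.findHomography(img_pts, world_pts)

# Convert the pixel at the object's lower edge to floor coordinates.
obj_px = np.float32([[[300, 250]]])
obj_world = cv2.perspectiveTransform(obj_px, H)[0, 0]
print(obj_world)                              # [x, y] on the floor, in metres

# Distance from the camera, given its (assumed measured) position on the floor plane.
cam_xy = np.float32([1.0, -0.5])
print(np.linalg.norm(obj_world - cam_xy))
```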

In Matlab, you can use something from here. Make sure to select projective as the transformation type. The result is a projective tform object that you can use in conjunction with this method to transform your points from one coordinate system to the other.

To convert in the other direction, you just invert your homography and use the result.
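Continuing the OpenCV sketch above (the inverse maps floor coordinates back to pixels):

```python
H_inv = np.linalg.inv(H)                           # floor -> image direction
floor_pt = np.float32([[[1.2, 0.8]]])              # a point on the floor, in metres
print(cv2.perspectiveTransform(floor_pt, H_inv))   # its pixel coordinates
```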

+1

Source: https://habr.com/ru/post/1013093/

