Extrinsic calibration with cv::solvePnP

I'm currently trying to implement an alternative method for webcam-based AR using an external tracking system. I have everything in my environment configured except for the extrinsic calibration. I decided to use cv::solvePnP() because it supposedly does pretty much exactly what I want, but after two weeks I am pulling my hair out trying to get it to work. The diagram below shows my configuration. c1 is my camera, c2 is the optical tracker I'm using, M is the tracked marker attached to the camera, and ch is the checkerboard.

My configuration diagram

As things stand, I pass in the image pixel coordinates acquired with cv::findChessboardCorners(). The world points are acquired with reference to the tracked marker M affixed to camera c1 (the extrinsic is therefore the transform from this marker's frame to the camera origin). I have tested this with data sets of up to 50 points to mitigate the possibility of local minima, but for now I'm testing with only four 2D/3D point pairs. The resulting extrinsic I get from the rvec and tvec returned by cv::solvePnP() is massively off, in both translation and rotation, relative to both a manually created ground truth and a basic visual analysis (the translation implies a distance of 1100 mm while the camera is at most 10 mm away).
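For context, the corner acquisition step is nothing exotic; a minimal sketch of it is below (the 9x6 inner-corner pattern and the grayscale input are placeholders, not my actual values):

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Sketch only: detect and refine the 2D chessboard corners fed to solvePnP.
    std::vector<cv::Point2f> detectCorners(const cv::Mat& gray)
    {
        cv::Size patternSize(9, 6);                  // inner corners per row/column (assumed)
        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(gray, patternSize, corners))
        {
            // Sub-pixel refinement tightens the 2D measurements before pose estimation
            cv::cornerSubPix(gray, corners, cv::Size(11, 11), cv::Size(-1, -1),
                             cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01));
        }
        return corners;
    }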

Initially I thought the issue was that I had some ambiguity in how the board pose was determined, but I'm now fairly certain that's not the case. The math seems pretty simple, and after all my work setting up the system, getting hung up on what is essentially a one-liner is enormously frustrating. I'm honestly running out of options, so if anyone can help I will be hugely in your debt. My test code is below and is identical to my implementation minus some rendering calls. The ground truth extrinsic I have that works with my program is as follows (basically a pure rotation around one axis and a small translation):

    1    0      0      29
    0    0.77  -0.64   32.06
    0    0.64   0.77  -39.86
    0    0      0      1

Thanks!

    #include <opencv2/opencv.hpp>
    #include <opencv2/highgui/highgui.hpp>

    int main()
    {
        int imageSize = 4;
        int markupsSize = 4;
        std::vector<cv::Point2d> imagePoints;
        std::vector<cv::Point3d> markupsPoints;
        double tempImage[3], tempMarkups[3]; // Temp variables for iterative data construction

        cv::Mat CamMat = (cv::Mat_<double>(3,3) << 566.07469648019332, 0, 318.20416967732666,
                                                   0, 565.68051204299513, -254.95231997403764,
                                                   0, 0, 1);

        cv::Mat DistMat = (cv::Mat_<double>(5,1) << -1.1310542849120900e-001, 4.5797249991542077e-001,
                                                     7.8356355644908070e-003, 3.7617039978623504e-003,
                                                    -1.2734302146228518e+000);

        cv::Mat rvec = cv::Mat::zeros(3,1, cv::DataType<double>::type);
        cv::Mat tvec = cv::Mat::zeros(3,1, cv::DataType<double>::type);
        cv::Mat R;
        cv::Mat extrinsic = cv::Mat::eye(4, 4, CV_64F);

        // Escape if markup lists aren't equally sized
        if(imageSize != markupsSize)
        {
            // TODO: Replace with try, throw error code, and catch in qSlicerLocalizationModuleWidget
            return 0;
        }

        // Four principal chessboard corners only
        imagePoints.push_back(cv::Point2d(368.906, 248.123));
        imagePoints.push_back(cv::Point2d(156.583, 252.414));
        imagePoints.push_back(cv::Point2d(364.808, 132.384));
        imagePoints.push_back(cv::Point2d(156.692, 128.289));

        markupsPoints.push_back(cv::Point3d(495.115, 39.106, 93.79));
        markupsPoints.push_back(cv::Point3d(463.143, -86.286, -51.178));
        markupsPoints.push_back(cv::Point3d(500.236, 121.988, 24.056));
        markupsPoints.push_back(cv::Point3d(471.276, -3.23, -127.809));

        // Larger data set
        /*imagePoints.push_back(cv::Point2d(482.066, 233.778));
        imagePoints.push_back(cv::Point2d(448.024, 232.038));
        imagePoints.push_back(cv::Point2d(413.895, 230.785));
        imagePoints.push_back(cv::Point2d(380.653, 229.242));
        imagePoints.push_back(cv::Point2d(347.983, 227.785));
        imagePoints.push_back(cv::Point2d(316.103, 225.977));
        imagePoints.push_back(cv::Point2d(284.02, 224.905));
        imagePoints.push_back(cv::Point2d(252.929, 223.611));
        imagePoints.push_back(cv::Point2d(483.41, 200.527));
        imagePoints.push_back(cv::Point2d(449.456, 199.406));
        imagePoints.push_back(cv::Point2d(415.843, 197.849));
        imagePoints.push_back(cv::Point2d(382.59, 196.763));
        imagePoints.push_back(cv::Point2d(350.094, 195.616));
        imagePoints.push_back(cv::Point2d(317.922, 194.027));
        imagePoints.push_back(cv::Point2d(286.922, 192.814));
        imagePoints.push_back(cv::Point2d(256.006, 192.022));
        imagePoints.push_back(cv::Point2d(484.292, 167.816));
        imagePoints.push_back(cv::Point2d(450.678, 166.982));
        imagePoints.push_back(cv::Point2d(417.377, 165.961));

        markupsPoints.push_back(cv::Point3d(457.132, 59.822, 89.247));
        markupsPoints.push_back(cv::Point3d(451.634, 42.015, 69.719));
        markupsPoints.push_back(cv::Point3d(447.06, 22.927, 48.635));
        markupsPoints.push_back(cv::Point3d(442.424, 4.454, 28.659));
        markupsPoints.push_back(cv::Point3d(437.621, -14.395, 7.495));
        markupsPoints.push_back(cv::Point3d(433.386, -33.034, -12.009));
        markupsPoints.push_back(cv::Point3d(429.227, -51.001, -32.269));
        markupsPoints.push_back(cv::Point3d(424.291, -70.266, -52.667));
        markupsPoints.push_back(cv::Point3d(460.300, 79.769, 69.948));
        markupsPoints.push_back(cv::Point3d(455.020, 61.379, 49.306));
        markupsPoints.push_back(cv::Point3d(450.501, 43.288, 30.250));
        markupsPoints.push_back(cv::Point3d(446.062, 24.572, 8.713));
        markupsPoints.push_back(cv::Point3d(441.346, 5.823, -10.997));
        markupsPoints.push_back(cv::Point3d(436.670, -13.135, -31.428));
        markupsPoints.push_back(cv::Point3d(432.367, -31.428, -51.785));
        markupsPoints.push_back(cv::Point3d(427.745, -50.016, -72.519));
        markupsPoints.push_back(cv::Point3d(464.824, 101.129, 52.251));
        markupsPoints.push_back(cv::Point3d(458.628, 81.864, 30.810));
        markupsPoints.push_back(cv::Point3d(454.120, 63.546, 10.458)); */

        // Calculate camera pose
        cv::solvePnP(cv::Mat(markupsPoints), cv::Mat(imagePoints), CamMat, DistMat, rvec, tvec);
        cv::Rodrigues(rvec, R);

        // Invert the result of Rodrigues by transposing the rotation matrix and inverting tvec
        R = R.t();
        tvec = -R * tvec; // translation of inverse

        std::cout << "Tvec = " << std::endl << tvec << std::endl;

        // Copy R and tvec into the extrinsic matrix
        extrinsic( cv::Range(0,3), cv::Range(0,3) ) = R * 1;
        extrinsic( cv::Range(0,3), cv::Range(3,4) ) = tvec * 1;

        // Fill last row of homogeneous transform (0,0,0,1)
        double *p = extrinsic.ptr<double>(3);
        p[0] = p[1] = p[2] = 0;
        p[3] = 1;

        std::cout << "Extrinsic = " << std::endl << extrinsic << std::endl << std::endl;
        std::cout << "Extrinsic (inv) = " << std::endl << extrinsic.inv() << std::endl;

        std::cin >> tempImage[0];

        return 0;
    }
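One sanity check worth mentioning (a sketch, not part of the code above): reproject the 3D markups with the rvec/tvec exactly as returned by solvePnP, i.e. before R is transposed and tvec is inverted, and compare against the measured pixels.

    // Sketch of a reprojection check using the variables from the code above.
    std::vector<cv::Point2d> reprojected;
    cv::projectPoints(markupsPoints, rvec, tvec, CamMat, DistMat, reprojected);
    double meanErr = 0;
    for (size_t i = 0; i < reprojected.size(); ++i)
    {
        double dx = imagePoints[i].x - reprojected[i].x;
        double dy = imagePoints[i].y - reprojected[i].y;
        meanErr += std::sqrt(dx * dx + dy * dy);
    }
    meanErr /= reprojected.size();
    std::cout << "Mean reprojection error (px): " << meanErr << std::endl;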

EDIT 1: I tried normalizing the pixel values using Chi Xu's method (xn = (x - cx) / f, yn = (y - cy) / f). No luck :(
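Concretely, the normalization I tried was along these lines (a sketch using the intrinsics from CamMat; with points normalized this way, solvePnP would then be called with an identity camera matrix and zero distortion):

    // Sketch: map a pixel to the normalized image plane using the intrinsic matrix K.
    cv::Point2d normalizePixel(const cv::Point2d& p, const cv::Mat& K)
    {
        double fx = K.at<double>(0, 0), fy = K.at<double>(1, 1);
        double cx = K.at<double>(0, 2), cy = K.at<double>(1, 2);
        return cv::Point2d((p.x - cx) / fx, (p.y - cy) / fy);
    }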

EDIT 2: It seems that almost everyone who uses solvePnP marks the checkerboard corners as the vectors for their world frame and origin, so I'm going to try registering my tracker to the checkerboard instead. If this works as expected, the first coordinate I mark will be approximately <0,0>. The resulting transform from solvePnP would then be multiplied by the inverse of the world-to-camera-marker transform, producing (hopefully) the extrinsic matrix.
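For reference, the object points those examples use are just a flat grid on the board plane, something like this sketch (the corner counts and square size are placeholders for my actual board, and the same OpenCV includes as above are assumed):

    // Sketch: board-as-world-frame object points on the Z = 0 plane,
    // with the origin at the first corner.
    std::vector<cv::Point3d> boardPoints;
    int boardCols = 8, boardRows = 5;      // inner corners (assumed)
    double squareSize = 25.0;              // mm per square (assumed)
    for (int i = 0; i < boardRows; ++i)
        for (int j = 0; j < boardCols; ++j)
            boardPoints.push_back(cv::Point3d(j * squareSize, i * squareSize, 0.0));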

EDIT 3: I followed the steps described in EDIT 2. After establishing the coordinate system on the checkerboard, I calculated the transform cTw from checkerboard space to world space and verified it in my rendering environment. I then calculated the extrinsic mTc representing the transform from camera space to checkerboard space. The camera-marker transform wTr was acquired from the tracking system. Finally, I multiplied all of the transforms as follows to move my transform all the way from camera origin to camera marker:

mTc*cTw*wTr
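In code, that chaining is just a product of 4x4 homogeneous matrices built the same way as `extrinsic` in the code above (a sketch; the individual transforms are assumed to already be filled in, same includes as above):

    // Sketch: compose the transforms named in the text, each a 4x4 CV_64F matrix.
    cv::Mat composeExtrinsic(const cv::Mat& mTc, const cv::Mat& cTw, const cv::Mat& wTr)
    {
        CV_Assert(mTc.size() == cv::Size(4, 4) &&
                  cTw.size() == cv::Size(4, 4) &&
                  wTr.size() == cv::Size(4, 4));
        return mTc * cTw * wTr;   // chained left-to-right as in the text
    }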

Alas, the exact same result. I'm dying here for any indication of what I'm doing wrong. If anyone can tell, I beg you to help.

EDIT 4: Today I realized that I had configured my checkerboard axes so that they formed a left-handed coordinate system. Since OpenCV works with the standard right-handed form, I decided to retry the tests with my checkerboard frame axes configured right-handed. While the results were about the same, I did notice that the world-to-checkerboard transform was now in a non-standard form, i.e. the 3x3 rotation matrix was no longer roughly symmetrical about the diagonal as it should be. This indicates that my tracking system is not using a right-handed coordinate system (it would have been great if they had documented that. Or anything, for that matter). Although I'm not sure how to resolve this, I am certain it is part of the problem. I'll keep hammering away at it and hope someone here knows what to do. I've also added the diagram given to me on the OpenCV boards by Eduardo. Thanks Eduardo!
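If the handedness really is the culprit, one way I understand to re-express a tracker pose in a right-handed frame is a change of basis that flips the offending axis on both sides (a sketch; which axis actually needs flipping depends on the tracker and is an assumption here):

    // Sketch: convert a left-handed 4x4 tracker transform to right-handed,
    // assuming the difference is a flipped Z axis.
    cv::Mat toRightHanded(const cv::Mat& T_lh)
    {
        cv::Mat S = cv::Mat::eye(4, 4, CV_64F);
        S.at<double>(2, 2) = -1.0;   // flip Z in both the source and target frames
        return S * T_lh * S;         // S is its own inverse, so this is a change of basis
    }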

1 answer

So I figured out the issue. Unsurprisingly, it was in the implementation, not the theory. When the project was initially designed, we used a webcam-based tracker. That tracker had its z-axis coming out of the marker plane. When we switched to the optical tracker, the code was ported mostly as-is. Unfortunately for us (RIP two months of my life), we never checked whether the z-axis still came out of the marker plane (it doesn't). So the render pipeline was being assigned the wrong view and up vectors for the scene camera. It mostly works now, except that the translations are off for some reason. An entirely different problem though!
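To clarify what I mean by view and up vectors, here is an illustrative sketch (not my actual rendering code; it assumes a camera-to-world transform like the `extrinsic` computed above and the OpenCV convention that the camera looks along +Z with +Y pointing down in the image):

    // Sketch: pull a render camera's position, view direction, and up vector
    // out of a 4x4 camera-to-world transform (OpenCV camera convention assumed).
    void cameraVectors(const cv::Mat& camToWorld,
                       cv::Vec3d& position, cv::Vec3d& viewDir, cv::Vec3d& up)
    {
        for (int i = 0; i < 3; ++i)
        {
            position[i] =  camToWorld.at<double>(i, 3);
            viewDir[i]  =  camToWorld.at<double>(i, 2);   // camera +Z expressed in world
            up[i]       = -camToWorld.at<double>(i, 1);   // -Y, since image Y points down
        }
    }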



