I am trying to use OpenCV to do some basic augmented reality. The way I am going about it is to use findChessboardCorners to get a set of points from the camera image. Then I create a 3D quad along the z = 0 plane and use solvePnP to get a homography between the imaged points and the planar points. From that, I figure I should be able to set up a modelview matrix that lets me render a cube with the right pose on top of the image.
The documentation for solvePnP says that it outputs a rotation vector that (together with the translation vector) "brings points from the model coordinate system to the camera coordinate system." I think the opposite of that is what I want: since my quad is flat on the z = 0 plane, I want a modelview matrix that transforms that flat quad into the appropriately posed 3D plane.
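To make that concrete, this is roughly the inversion I have in mind (just a sketch; invertPose is an illustrative helper, not code from my project): expand rvec into a 3x3 rotation matrix with cv::Rodrigues, then invert the rigid transform as R_inv = R^T and t_inv = -R^T * t.

// Sketch only: invert the model -> camera transform that solvePnP returns.
// Assumes rvec and tvec are 3x1 cv::Mats of the same depth.
void invertPose(const cv::Mat& rvec, const cv::Mat& tvec, cv::Mat& Rinv, cv::Mat& tinv)
{
    cv::Mat R;
    cv::Rodrigues(rvec, R);   // 3x1 rotation vector -> 3x3 rotation matrix
    Rinv = R.t();             // the inverse of a rotation matrix is its transpose
    tinv = -Rinv * tvec;      // camera -> model translation
}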
I thought that by performing the opposite rotations and translations in reverse order I could compute the correct modelview matrix, but that doesn't seem to work. While the rendered object (a cube) does move with the camera image and appears to be roughly correct in translation, the rotation does not work at all: it rotates about multiple axes when it should only rotate about one, and sometimes in the wrong direction. Here is what I am doing so far:
std::vector<Point2f> corners;

bool found = findChessboardCorners(*_imageBuffer, cv::Size(5, 4), corners,
                                   CV_CALIB_CB_FILTER_QUADS | CV_CALIB_CB_FAST_CHECK);

if (found)
{
    drawChessboardCorners(*_imageBuffer, cv::Size(6, 5), corners, found);

    std::vector<double> distortionCoefficients(5);  // camera distortion
    distortionCoefficients[0] =  0.070969;
    distortionCoefficients[1] =  0.777647;
    distortionCoefficients[2] = -0.009131;
    distortionCoefficients[3] = -0.013867;
    distortionCoefficients[4] = -5.141519;

    // Since the image was resized, we need to scale the found corner points
    float sw = _width / SMALL_WIDTH;
    float sh = _height / SMALL_HEIGHT;

    std::vector<Point2f> board_verts;
    board_verts.push_back(Point2f(corners[0].x * sw, corners[0].y * sh));
    board_verts.push_back(Point2f(corners[15].x * sw, corners[15].y * sh));
    board_verts.push_back(Point2f(corners[19].x * sw, corners[19].y * sh));
    board_verts.push_back(Point2f(corners[4].x * sw, corners[4].y * sh));
    Mat boardMat(board_verts);

    std::vector<Point3f> square_verts;
    square_verts.push_back(Point3f(-1,  1, 0));
    square_verts.push_back(Point3f(-1, -1, 0));
    square_verts.push_back(Point3f( 1, -1, 0));
    square_verts.push_back(Point3f( 1,  1, 0));
    Mat squareMat(square_verts);

    // Transform the camera intrinsic parameters into an OpenGL camera matrix
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();

    // Camera parameters
    double f_x = 786.42938232; // Focal length in x axis
    double f_y = 786.42938232; // Focal length in y axis (usually the same?)
    double c_x = 217.01358032; // Camera principal point x
    double c_y = 311.25384521; // Camera principal point y

    cv::Mat cameraMatrix(3, 3, CV_32FC1);
    cameraMatrix.at<float>(0, 0) = f_x;
    cameraMatrix.at<float>(0, 1) = 0.0;
    cameraMatrix.at<float>(0, 2) = c_x;
    cameraMatrix.at<float>(1, 0) = 0.0;
    cameraMatrix.at<float>(1, 1) = f_y;
    cameraMatrix.at<float>(1, 2) = c_y;
    cameraMatrix.at<float>(2, 0) = 0.0;
    cameraMatrix.at<float>(2, 1) = 0.0;
    cameraMatrix.at<float>(2, 2) = 1.0;

    Mat rvec(3, 1, CV_32F), tvec(3, 1, CV_32F);
    solvePnP(squareMat, boardMat, cameraMatrix, distortionCoefficients, rvec, tvec);

    // rvec and tvec were created as CV_32F, so read them back as floats
    _rv[0] = rvec.at<float>(0, 0);
    _rv[1] = rvec.at<float>(1, 0);
    _rv[2] = rvec.at<float>(2, 0);
    _tv[0] = tvec.at<float>(0, 0);
    _tv[1] = tvec.at<float>(1, 0);
    _tv[2] = tvec.at<float>(2, 0);
}
Then in the drawing code ...
GLKMatrix4 modelViewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Translate(modelViewMatrix, -tv[1], -tv[0], -tv[2]);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[0], 1.0f, 0.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[1], 0.0f, 1.0f, 0.0f);
modelViewMatrix = GLKMatrix4Rotate(modelViewMatrix, -rv[2], 0.0f, 0.0f, 1.0f);
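For reference, here is a sketch of building the matrix from the full Rodrigues rotation rather than three separate axis rotations (assumptions: GLKMatrix4.m is column-major, and I am ignoring the OpenCV vs. OpenGL camera axis conventions, which may need extra sign flips; whether the rotation or its inverse belongs here is exactly the part I am unsure about):

// Sketch: expand the axis-angle rvec into a full rotation matrix and load it into GLKit.
cv::Mat rodrigues = (cv::Mat_<float>(3, 1) << rv[0], rv[1], rv[2]);
cv::Mat R;
cv::Rodrigues(rodrigues, R);   // solvePnP's rvec is an axis-angle vector, not Euler angles

GLKMatrix4 mv = GLKMatrix4Identity;
for (int col = 0; col < 3; col++)
    for (int row = 0; row < 3; row++)
        mv.m[col * 4 + row] = R.at<float>(row, col);   // GLKMatrix4.m is column-major
mv.m[12] = tv[0];   // translation goes in the fourth column
mv.m[13] = tv[1];
mv.m[14] = tv[2];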
The vertices I render form a unit-length cube around the origin (i.e. from -0.5 to 0.5 along each edge). I know that with the OpenGL transformation functions the operations are applied in "reverse order", so the above should rotate the cube about the z, y, and then x axes, and then translate it. However, it seems to translate first and then rotate, so perhaps Apple's GLKMatrix4 works differently?
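As a sanity check on the ordering, a toy example like the following (it assumes GLKMatrix4Rotate post-multiplies, i.e. returns m * R, which is my reading of the GLKit docs) should show which transform reaches the vertex first:

GLKMatrix4 T = GLKMatrix4MakeTranslation(1.0f, 0.0f, 0.0f);
GLKMatrix4 R = GLKMatrix4MakeRotation(M_PI_2, 0.0f, 0.0f, 1.0f);

// If GLKMatrix4Rotate(m, ...) is m * R, these two matrices should be identical.
GLKMatrix4 composed = GLKMatrix4Rotate(T, M_PI_2, 0.0f, 0.0f, 1.0f);
GLKMatrix4 product  = GLKMatrix4Multiply(T, R);

// With column vectors, (T * R) * v rotates first, then translates:
// (1, 0, 0) -> rotate 90 degrees about z -> (0, 1, 0) -> translate x by 1 -> (1, 1, 0).
GLKVector4 v = GLKVector4Make(1.0f, 0.0f, 0.0f, 1.0f);
GLKVector4 a = GLKMatrix4MultiplyVector4(composed, v);
GLKVector4 b = GLKMatrix4MultiplyVector4(product, v);
printf("composed: (%f, %f, %f)  product: (%f, %f, %f)\n", a.x, a.y, a.z, b.x, b.y, b.z);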
This question seems very similar to mine, and in particular coder9's answer seems like it might be more or less what I am looking for. However, I tried it and compared the results against my own method, and the matrices I arrived at were the same in both cases. I feel like that answer is right, but I must be missing some important detail.