I want to identify LEGO bricks in order to build a LEGO sorting machine (I use C++ with OpenCV). That means I have to distinguish between objects that look very similar.
The bricks pass my camera one at a time on a flat conveyor belt. But they can lie in any possible orientation: upside down, on their side, or "normal".
My approach is to teach the sorting machine the bricks by showing them to the camera in many different positions and rotations. The features of each brick type are computed with the SURF algorithm.
    void calculateFeatures(const cv::Mat& image,
                           std::vector<cv::KeyPoint>& keypoints,
                           cv::Mat& descriptors)
    {
        // detector == cv::SurfFeatureDetector(10)
        detector->detect(image, keypoints);
        // extractor == cv::SurfDescriptorExtractor()
        extractor->compute(image, keypoints, descriptors);
    }
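For context, detector and extractor are member variables; a minimal sketch of how they might be set up with the OpenCV 2.x API (SURF lives in the nonfree module, so that include is an assumption about the build):

    #include <opencv2/nonfree/features2d.hpp>

    // hessian threshold 10 for the detector, default parameters for the extractor
    cv::Ptr<cv::FeatureDetector> detector(new cv::SurfFeatureDetector(10));
    cv::Ptr<cv::DescriptorExtractor> extractor(new cv::SurfDescriptorExtractor());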
When an unknown brick (the brick I want to sort) comes along, its features are computed as well and compared with the known ones. To find wrongly matched features, I proceed as described in the OpenCV 2 Cookbook:
With a matcher (= cv::BFMatcher(cv::NORM_L2)), search for the two nearest neighbors in both directions:
    matcher.knnMatch(descriptorsImage1, descriptorsImage2, matches1, 2);
    matcher.knnMatch(descriptorsImage2, descriptorsImage1, matches2, 2);
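(Here matcher and the match containers are assumed to be declared as:)

    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch> > matches1, matches2;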
I check the ratio between the distances of the two nearest neighbors found. If the two distances are very similar, the match is ambiguous and is probably wrong, so it gets thrown away (ratio test).
    // loop for matches1 and matches2
    for (iterator matchIterator over all matches)
        if ( ((*matchIterator)[0].distance / (*matchIterator)[1].distance) > 0.65 )
            throw away
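As real code, the ratio test looks roughly like this sketch (ratioTest is a hypothetical helper name; the 0.65 threshold is the one from above):

    // Sketch: discard a match when its two nearest neighbors are too close
    // in distance, i.e. the match is ambiguous. Emptied entries mark removal.
    void ratioTest(std::vector<std::vector<cv::DMatch> >& matches)
    {
        for (std::vector<std::vector<cv::DMatch> >::iterator it = matches.begin();
             it != matches.end(); ++it)
        {
            if (it->size() < 2 ||
                (*it)[0].distance / (*it)[1].distance > 0.65f)
                it->clear(); // ambiguous or unmatched -> throw away
        }
    }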
Finally, only symmetric match pairs are accepted. These are matches in which not only n1 is the nearest neighbor of feature f1, but also f1 is the nearest neighbor of n1.
    for (iterator matchIterator1 over all matches)
        for (iterator matchIterator2 over all matches)
            if ((*matchIterator1)[0].queryIdx == (*matchIterator2)[0].trainIdx &&
                (*matchIterator2)[0].queryIdx == (*matchIterator1)[0].trainIdx)
                accept the match
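Written out (following the OpenCV 2 Cookbook pattern; symmetryTest is a hypothetical helper name):

    // Sketch: keep only matches that are mutual nearest neighbors.
    void symmetryTest(const std::vector<std::vector<cv::DMatch> >& matches1,
                      const std::vector<std::vector<cv::DMatch> >& matches2,
                      std::vector<cv::DMatch>& symMatches)
    {
        for (std::vector<std::vector<cv::DMatch> >::const_iterator it1 = matches1.begin();
             it1 != matches1.end(); ++it1)
        {
            if (it1->empty()) continue; // removed by the ratio test
            for (std::vector<std::vector<cv::DMatch> >::const_iterator it2 = matches2.begin();
                 it2 != matches2.end(); ++it2)
            {
                if (it2->empty()) continue;
                if ((*it1)[0].queryIdx == (*it2)[0].trainIdx &&
                    (*it2)[0].queryIdx == (*it1)[0].trainIdx)
                {
                    symMatches.push_back((*it1)[0]); // symmetric -> accept
                    break;
                }
            }
        }
    }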
Now only good matches remain. To filter out a few more bad matches, I check which matches fit the projection of img1 onto img2 using the fundamental matrix.
    std::vector<uchar> inliers(points1.size(), 0);
    cv::findFundamentalMat(
        cv::Mat(points1), cv::Mat(points2), // matching points
        inliers,       // match status (inlier or outlier)
        CV_FM_RANSAC,  // RANSAC method
        3,             // distance to epipolar line
        0.99);         // confidence probability

    std::vector<cv::DMatch> goodMatches;
    // extract the surviving (inlier) matches
    std::vector<uchar>::const_iterator itIn = inliers.begin();
    std::vector<cv::DMatch>::const_iterator itM = allMatches.begin();
    // for all matches
    for ( ; itIn != inliers.end(); ++itIn, ++itM)
        if (*itIn) // it is a valid match
            goodMatches.push_back(*itM);
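For reference, points1 and points2 are assumed to be the pixel coordinates of the surviving matches, filled from the keypoints of the two images like this sketch (keypoints1/keypoints2 come from calculateFeatures):

    // Sketch: collect the matched keypoint coordinates for findFundamentalMat
    std::vector<cv::Point2f> points1, points2;
    for (std::vector<cv::DMatch>::const_iterator it = allMatches.begin();
         it != allMatches.end(); ++it)
    {
        points1.push_back(keypoints1[it->queryIdx].pt); // keypoint in image 1
        points2.push_back(keypoints2[it->trainIdx].pt); // keypoint in image 2
    }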
The result is pretty good. But in extremely rare cases, misclassifications still occur.
In the picture above you can see that such a brick is recognized correctly.
In the second picture, however, the wrong brick is recognized.
Now the question is how I can improve the matching.
I had two different ideas:

The matches in the second picture do go to features that genuinely fit each other, but only from a viewpoint that has changed a lot. To recognize a brick, I have to compare it against many different positions anyway (at least as shown in Figure 3). That means I know the viewpoint is only allowed to change minimally between the query image and the best training image. The information about how much the viewpoint changed should be hidden in the fundamental matrix. How can I read from this matrix how much the position in space has changed? The rotation and strong scaling are particularly interesting; if the brick happens to lie further to the left, that does not matter.
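What I have in mind is something like the following sketch. It assumes a calibrated camera (known intrinsics K) and OpenCV >= 3.0, where cv::findEssentialMat and cv::recoverPose exist; it goes through the essential matrix instead of the fundamental matrix and extracts the relative rotation angle:

    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <cmath>

    // Sketch: estimate how strongly the viewpoint rotated between two views.
    // Assumes known camera intrinsics K (calibration) and OpenCV >= 3.0.
    double relativeRotationDegrees(const std::vector<cv::Point2f>& points1,
                                   const std::vector<cv::Point2f>& points2,
                                   const cv::Mat& K)
    {
        // The essential matrix encodes the relative rotation R and the
        // translation t (only up to scale).
        cv::Mat E = cv::findEssentialMat(points1, points2, K, cv::RANSAC, 0.99, 3.0);

        cv::Mat R, t;
        cv::recoverPose(E, points1, points2, K, R, t);

        // Rotation angle from the trace of R: cos(theta) = (trace(R) - 1) / 2
        double cosTheta = (cv::trace(R)[0] - 1.0) / 2.0;
        cosTheta = std::max(-1.0, std::min(1.0, cosTheta)); // clamp numeric noise
        return std::acos(cosTheta) * 180.0 / CV_PI;
    }

A recognition would then only be accepted if this angle stays small. Since t from the essential matrix is only defined up to scale, a strong scale change would have to be detected separately.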
Second idea:
I computed the fundamental matrix from 2 images and filtered out the features that do not fit the projection. Shouldn't there be a way to do the same with three or more images? (Keyword: trifocal tensor.) That should make the matching even more stable. But I neither know how to do this with OpenCV, nor can I find anything about it on Google.