Image Recognition - Representing a Binary Descriptor in Mat - OpenCV Android

I'm just curious. I am new here, so please bear with me if my question is naive.

Let's say I'm making an Android application with pattern recognition, where all processing, even the computationally intensive parts, must happen on the mobile device's processor.

I am at the stage where I have already processed the images and extracted some features from them. The set of images all comes from a single building, where the app must recognize specific objects of interest (different windows, paintings, artifacts in and around the building). So this is a closed domain, and I can provide enough images of the objects from different angles. I plan to train a neural network and ship it with the application instead of an image-matching algorithm.

My idea is to extract the keypoints and compute the descriptors (using FREAK on ORB keypoints), and from these descriptors build a single file or array that ends up looking something like this:

              Desc1   Desc2   Desc3   Desc4   ...  DescN   Class
 ------------------------------------------------------------------
 Picture 1    0.121   0.923   0.553   0.22         0.28    "object1"
 Picture 2    0.22    0.53    0.54    0.55         0.32    "object1" (different scale, angle)
 Picture 3    ...     ...     ...     ...          ...     "object2"
 Picture N
 Picture N+1

so that I can feed it to a neural network for training. However, I am stuck because I have no idea how the binary feature/descriptor is represented in a matrix (the OpenCV Mat class), and how I would then normalize these binary descriptors so I can feed them into a neural network (a multi-layer perceptron) for training. (Even pseudocode would help a lot.)
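For the normalization step asked about here, one common option (a sketch of my own, not from the question) is to scale each unsigned byte value of a descriptor row into [0, 1] before feeding it to the MLP. The class and method names below are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class DescriptorNormalizer {

    // Scales raw descriptor bytes (0..255, as stored in a CV_8U Mat row)
    // into doubles in [0, 1], a typical input range for a neural network.
    public static List<Double> normalizeDescriptor(int[] rawBytes) {
        List<Double> normalized = new ArrayList<Double>();
        for (int value : rawBytes) {
            normalized.add(value / 255.0);
        }
        return normalized;
    }

    public static void main(String[] args) {
        int[] row = {0, 128, 255};
        System.out.println(normalizeDescriptor(row));
    }
}
```

Each normalized row, together with its class label, would then form one training sample in the table above.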

2 answers

I cannot give a complete answer to your question because I am not familiar with neural networks, but I can give you some ideas about the binary representation of ORB descriptors.

  • You cannot detect keypoints with FREAK; it is only a descriptor. As the FREAK paper describes, you detect keypoints with the FAST corner detector and then describe them with FREAK. If you want to recognize objects using ORB descriptors, you should use ORB for both detecting the keypoints and describing them. Note that ORB's keypoint detection can also be based on FAST; you can switch to it by changing the scoreType parameter (see the OpenCV documentation). When you use Android, you can set this parameter as described here

  • About the binary string descriptors: I also needed them, to implement descriptor matching with a MySQL query. Since the Java Mat in OpenCV only exposes the descriptor values as doubles (via Mat.get), I implemented a method to convert them to binary strings. For this, each row of the descriptor Mat has to be read into a List<Double>. You can then use my function to get the binary descriptors. The function returns a List<String>.

Here is the code:

 // Converts descriptors (one row of the descriptor Mat per inner list,
 // read out as doubles via Mat.get) into binary strings, one per keypoint.
 public static List<String> descriptorsToBinary(List<List<Double>> descriptors) {
     List<String> binaryDescriptors = new ArrayList<String>();
     for (List<Double> desc : descriptors) {
         StringBuilder descBin = new StringBuilder();
         for (int i = 0; i < desc.size(); i++) {
             String bits = Integer.toBinaryString((int) (double) desc.get(i));
             // Each CV_8U element is one byte, so left-pad to 8 bits;
             // a 64-byte FREAK row then yields a 512-bit string.
             // (Zero-padding does not change Hamming distances.)
             while (bits.length() < 8) {
                 bits = "0" + bits;
             }
             descBin.append(bits);
         }
         binaryDescriptors.add(descBin.toString());
     }
     return binaryDescriptors;
 }

The returned list of strings will have the same size as the list you get when you convert the MatOfKeyPoint to a List<KeyPoint>.

Here is how I checked whether these binary descriptors are correct:

  • I matched the original Mat descriptors with a BruteForce-Hamming matcher, as suggested in the ORB paper.
  • I logged the distances returned by the matcher.
  • Then I computed the Hamming distances between the string descriptors of the same images.
  • I checked whether the OpenCV Hamming distances were the same as the string-descriptor distances. They were, so the conversion from Mat to List was done correctly.
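The cross-check in the steps above can be sketched as follows. The helper name hammingDistance is my own; its result on two binary strings should match the distance reported by OpenCV's BruteForce-Hamming matcher for the corresponding Mat rows:

```java
public class HammingCheck {

    // Hamming distance between two equal-length binary strings:
    // the number of positions where the bits differ.
    public static int hammingDistance(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("Descriptors must have equal length");
        }
        int distance = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        System.out.println(hammingDistance("10110", "10011")); // prints 2
    }
}
```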

Thus, the binary descriptors associated with the key points will look like this:

 Picture 1: object1
   keypoint1 : 512-bit binary descriptor (1s and 0s)
   keypoint2 : 512-bit binary descriptor
   keypoint3 : 512-bit binary descriptor
   ...
 Picture 2: object2
   keypoint1 : 512-bit binary descriptor
   keypoint2 : 512-bit binary descriptor
   keypoint3 : 512-bit binary descriptor
   ...
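A minimal way to hold that labeled structure in memory before training (a sketch of mine; the class name LabeledDescriptors is not from OpenCV) is a map from object label to the binary descriptors of all its keypoints:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LabeledDescriptors {

    // Maps an object label ("object1", "object2", ...) to the binary
    // descriptors of all keypoints extracted from pictures of that object.
    private final Map<String, List<String>> byLabel =
            new LinkedHashMap<String, List<String>>();

    public void add(String label, String binaryDescriptor) {
        if (!byLabel.containsKey(label)) {
            byLabel.put(label, new ArrayList<String>());
        }
        byLabel.get(label).add(binaryDescriptor);
    }

    public List<String> descriptorsFor(String label) {
        return byLabel.get(label);
    }

    public static void main(String[] args) {
        LabeledDescriptors data = new LabeledDescriptors();
        data.add("object1", "0101");
        data.add("object1", "0111");
        data.add("object2", "1000");
        System.out.println(data.descriptorsFor("object1").size()); // prints 2
    }
}
```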

Now, about the multilayer perceptron: I cannot help with that part, which is why I said at the beginning that my answer is incomplete. But I hope the notes I gave will help you solve your problem.


Instead of trying to implement a classifier from scratch, have you considered Haar training? You can train it to detect several objects in an image.

The training process takes a long time, though.

http://note.sonots.com/SciSoftware/haartraining.html

Hope this helps!


Source: https://habr.com/ru/post/1469112/

