How to save a hierarchical k-means tree for a large number of images using OpenCV?

I am trying to create a program that will find similar images in a dataset. Steps:

  • Extract SURF descriptors for all images
  • Save the descriptors
  • Apply kNN to the saved descriptors
  • Match the saved descriptors against the query image's descriptors using kNN

Now, each image's SURF descriptors will be saved in a hierarchical k-means tree. Should I save each tree as a separate file, or should I build one tree from all image descriptors and update it as images are added to the dataset?
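The extract-and-save part of the steps above can be sketched as follows. This is only a sketch: random 64-dimensional vectors stand in for real SURF descriptors, which would come from `cv2.xfeatures2d.SURF_create()` in the opencv-contrib non-free module.

```python
import numpy as np

# Random 64-dim vectors stand in for SURF descriptors here; real ones
# would come from cv2.xfeatures2d.SURF_create().detectAndCompute(...).
rng = np.random.default_rng(0)

def extract_descriptors(num_images=5, per_image=100, dim=64):
    # One (per_image, dim) float32 array per image, like SURF returns.
    return [rng.random((per_image, dim), dtype=np.float32)
            for _ in range(num_images)]

def save_descriptors(descs, path="descriptors.npz"):
    # Persist all per-image descriptor arrays in a single file.
    np.savez(path, *descs)

def load_descriptors(path="descriptors.npz"):
    with np.load(path) as data:
        return [data[k] for k in data.files]

descs = extract_descriptors()
save_descriptors(descs)
reloaded = load_descriptors()
assert all(np.array_equal(a, b) for a, b in zip(descs, reloaded))
```

The round-trip assertion at the end checks that what comes back from disk is bit-identical to what was extracted.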

This is the paper on which I am basing the program.

algorithm image-processing computer-vision feature-detection
2 answers

Use a KD-tree instead. You can build a k-dimensional tree hierarchically; you just need to decide what information to push down the tree that you want to persist. You can save the vectors/image descriptors to disk and load them into the KD-tree every time you start your program. Newly computed vectors/descriptors can then be added to both the tree and the disk store.

To summarize:

  • Compute descriptors
  • Load the new descriptors into the KD-tree
  • Save the same descriptors to disk
  • On every restart, load all descriptors into the tree
  • Query the tree for the best match
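A minimal sketch of this workflow, assuming `scipy.spatial.cKDTree` as the KD-tree implementation and NumPy files as the on-disk store (file names and function names here are illustrative, not from the original answer):

```python
import os
import numpy as np
from scipy.spatial import cKDTree

DIM = 64
rng = np.random.default_rng(1)

if os.path.exists("db.npy"):
    os.remove("db.npy")  # start the demo from a clean state

def save_new_descriptors(new, path="db.npy"):
    # Append new descriptors to the on-disk store.
    try:
        old = np.load(path)
        new = np.vstack([old, new])
    except FileNotFoundError:
        pass
    np.save(path, new)

def load_tree(path="db.npy"):
    # On startup: load all descriptors and rebuild the tree in memory.
    data = np.load(path)
    return cKDTree(data), data

# Index two batches of "image" descriptors, then query.
save_new_descriptors(rng.random((100, DIM)), "db.npy")
save_new_descriptors(rng.random((50, DIM)), "db.npy")
tree, data = load_tree("db.npy")
dist, idx = tree.query(data[0], k=3)  # 3 best matches for one descriptor
```

Querying with a descriptor that is already in the tree returns itself at distance zero, which is a handy sanity check.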

Hope this helps


Are you sure you want to do this with SURF descriptors? I am working on a similar application myself, based on this paper (Nister, Stewenius), and they swear by SIFT descriptors. That said, I think you could do this with any other descriptor.

Looking at it, the paper you refer to is newer than the one I am working from, but it cites neither Nister's paper nor this work (Sivic, Zisserman), which, as far as I know, underlies pretty much all content-based image retrieval systems.

To understand the problem better, before starting the implementation I first read Sivic, Zisserman to get a general idea of the system. They apply plain clustering after extracting SIFT descriptors from all the features. They use two different feature types for better accuracy: Shape Adapted (centered on corner-like features) and Maximally Stable (corresponding to high-contrast blobs; you can find them in this paper (Matas et al.)). The scalability of their system is limited because they store every feature directly, but they introduced the concept of inverted files, a technique borrowed from text retrieval (you can read about the basics here), which greatly speeds up the search process.
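The inverted-file idea can be shown in a few lines: instead of scanning every image, you keep a map from each visual word (cluster id) to the images containing it, so a query only touches images sharing at least one word with it. A minimal sketch (class and method names are my own, for illustration):

```python
from collections import defaultdict

class InvertedFile:
    """Map each visual word (cluster id) to the images that contain it."""

    def __init__(self):
        self.index = defaultdict(set)  # word id -> set of image ids

    def add_image(self, image_id, words):
        for w in words:
            self.index[w].add(image_id)

    def candidates(self, query_words):
        # Only images sharing at least one visual word with the query.
        result = set()
        for w in query_words:
            result |= self.index.get(w, set())
        return result

inv = InvertedFile()
inv.add_image("img1", [3, 17, 42])
inv.add_image("img2", [17, 99])
inv.add_image("img3", [7])
```

Here `inv.candidates([17])` returns only `img1` and `img2`; `img3` is never examined.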

After working through that paper, I recommend moving on to Nister, Stewenius, where they introduce hierarchical k-means clustering with L levels, both to store the features and for fast lookup in the image database. Now, if I am not badly mistaken, you do not store each descriptor as a separate tree. Instead, you build one tree from the existing features (where the cluster centers at each level effectively represent the "central" features of each cluster). Once the tree is built to the required depth (they recommend 10 clusters over 6 levels), the cluster centers at the last level each represent a very small number of features, so you can essentially forget all the original features (or at least their descriptors). Each original feature can be represented by its corresponding cluster center, and instead of descriptors for each image you only need to store which cluster centers (features) it contains. This is much cheaper, because each feature reduces to one or two integers encoding its path through the tree. The simplest encoding is to record the index of the cluster the feature falls into at each level: 10 clusters need 4 bits, and with 6 levels that is 4 × 6 = 24 bits, which fits into a 32-bit integer. You can, of course, implement the actual encoding in any way convenient for you. Oh, and they also use SIFT descriptors over MSER regions.
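The path encoding described above is easy to make concrete: pack one 4-bit cluster index per level into a single integer. A sketch, assuming the recommended 10 clusters and 6 levels:

```python
LEVELS, BRANCH = 6, 10  # 6 levels, 10 clusters per node (Nister, Stewenius)

def encode_path(path):
    """Pack a root-to-leaf path (one cluster index per level) into one int."""
    assert len(path) == LEVELS and all(0 <= c < BRANCH for c in path)
    code = 0
    for cluster in path:
        code = (code << 4) | cluster  # 4 bits suffice for indices 0..9
    return code

def decode_path(code):
    """Recover the per-level cluster indices from the packed integer."""
    path = []
    for _ in range(LEVELS):
        path.append(code & 0xF)
        code >>= 4
    return path[::-1]

code = encode_path([3, 1, 4, 1, 5, 9])
```

Since 4 × 6 = 24 bits, every code stays below 2^24, comfortably inside a 32-bit integer.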

In addition, if the images you use to build the vocabulary tree are representative (say, you are working with a dataset of outdoor images and build the tree from only a representative subset, knowing that the rest of the dataset contains no photos of, say, industrial sites), you can add new images very quickly. The only thing needed to add a new image to the dataset is to determine which of the precomputed cluster centers best represent its features (as mentioned, the last level of cluster centers is quite accurate) and store those cluster centers (the integer codes mentioned earlier). Finding the right cluster center is very fast: only 10 comparisons at each of the 6 levels.
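That lookup is just a greedy descent: at each level, compare the descriptor against the 10 child centers and follow the nearest one, for 60 distance computations total instead of searching 10^6 leaves. A toy sketch with random centers standing in for a tree trained by hierarchical k-means (a real tree has distinct children under every node; sharing one center block per level is a simplification for brevity):

```python
import numpy as np

LEVELS, BRANCH, DIM = 6, 10, 64
rng = np.random.default_rng(2)

# One (BRANCH, DIM) block of child centers per level. In a real
# vocabulary tree these come from hierarchical k-means and differ
# per node; random centers are used here only to show the descent.
centers = [rng.random((BRANCH, DIM)) for _ in range(LEVELS)]

def quantize(desc):
    """Greedy descent: pick the nearest of 10 centers at each level."""
    path = []
    for level_centers in centers:
        dists = np.linalg.norm(level_centers - desc, axis=1)
        path.append(int(np.argmin(dists)))
    return path  # 6 cluster indices = the feature's path through the tree

path = quantize(rng.random(DIM))
```

The resulting 6 indices are exactly what the integer encoding from the previous paragraph packs into one number.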

Hope this turns out to be useful to someone, since the question has been around for a while. :)

