The encoding I am focusing on is Fisher encoding, since it has given me the best results in my work so far. I want to apply the encoding to my extracted SIFT features and compare the system's performance with and without encoding.
Instead of starting from scratch, I found that VLFeat has a built-in library for Fisher encoding, and they provide a tutorial for it, as well as related material here
I have already completed most of what is required, but what actually gets encoded is confusing me. For example, the tutorial explains that Fisher encoding is performed using parameters obtained from a GMM, namely [means, covariances, priors], and according to the tutorial the SIFT features should be used to fit the GMM:
Fisher encoding uses a GMM to construct a visual vocabulary. To illustrate constructing a GMM, consider a set of two-dimensional data points. In practice, these points would be a collection of SIFT or other local image features.
numFeatures = 5000; dimension = 2; data = rand(dimension, numFeatures); numClusters = 30; [means, covariances, priors] = vl_gmm(data, numClusters);
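For reference, this is a sketch of what the tutorial call above produces, based on my reading of the VLFeat documentation (the expected sizes in the comments are my understanding, not tutorial output — VLFeat stores one Gaussian component per column and uses diagonal covariances):

```matlab
% Toy data as in the tutorial: 2-D points, 30 Gaussian components.
numFeatures = 5000;
dimension   = 2;
data        = rand(dimension, numFeatures);
numClusters = 30;

[means, covariances, priors] = vl_gmm(data, numClusters);

% Expected sizes:
size(means)        % dimension x numClusters, i.e. 2 x 30
size(covariances)  % dimension x numClusters (diagonal covariances), i.e. 2 x 30
size(priors)       % 1 x numClusters, i.e. 1 x 30
```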
Once I complete this step, am I supposed to encode a different dataset? That is what confuses me: I have already used my extracted SIFT features to estimate the GMM parameters.
Next, we create another random set of vectors, which should be encoded using the Fisher vector representation and the GMM just obtained:
encoding = vl_fisher(datatoBeEncoded, means, covariances, priors);
So here encoding is the final result, but WHAT is being encoded? I need my SIFT features, which I extracted from my images, to be encoded, but if I follow the tutorial they have already been used by the GMM. If so, what is datatoBeEncoded? Can I use the SIFT features again?
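For context, here is my current understanding of the standard pipeline, as a sketch: the GMM is fit once on SIFT descriptors pooled from the training images, and then datatoBeEncoded is each individual image's own SIFT descriptors, so every image yields one fixed-length Fisher vector. Note that extractSift is a hypothetical placeholder for whatever SIFT extraction is actually used, and trainImages/allImages are assumed cell arrays of images:

```matlab
% 1) Pool SIFT descriptors from the training images and fit the GMM once.
%    (extractSift is a placeholder for the actual SIFT extraction step.)
allDescriptors = [];
for i = 1:numel(trainImages)
    d = extractSift(trainImages{i});   % 128 x Ni matrix of descriptors
    allDescriptors = [allDescriptors, d];
end
numClusters = 128;
[means, covariances, priors] = vl_gmm(allDescriptors, numClusters);

% 2) Encode EACH image separately: datatoBeEncoded is that image's own
%    SIFT descriptors, so every image gets one fixed-length Fisher vector.
for i = 1:numel(allImages)
    d = extractSift(allImages{i});
    encoding = vl_fisher(d, means, covariances, priors);
end
```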
thanks
Update
@Shai
Thank you, but I believe I am doing something wrong. I don't quite understand what you mean by "comparing the image to itself". I have 4 classes with 1000 images each. I used the first 600 images from class 1 to estimate the GMM parameters, and then used these parameters to encode the Fisher vectors:
numClusters = 128; [means, covariances, priors] = vl_gmm(data, numClusters);
So means and covariances each have size 128 x 128, and priors has size 1 x 128.
Now I use these parameters to encode Fisher vectors for the remaining 400 images with:
encoding = vl_fisher(datatoBeEncoded, means, covariances, priors);
but the size of the encoding is very different from 12000 x 1, so the encodings cannot be compared with the generated models.
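If I understand the VLFeat documentation correctly, the Fisher vector length is fixed by the GMM alone: vl_fisher stacks the gradients with respect to the means and the covariances, giving a 2*D*K vector, where D is the descriptor dimension and K is numClusters. A quick arithmetic sketch with my settings (D = 128 for SIFT, K = 128):

```matlab
% Fisher vector length is 2*D*K, independent of how many descriptors
% an individual image contains.
D = 128;              % SIFT descriptor dimension
K = 128;              % numClusters
fvLength = 2 * D * K  % 32768, so each image encodes to a 32768 x 1 vector
```

So every encoded image should come out at the same fixed length, and a classifier trained on vectors of a different size would indeed reject them.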
I already had a system that worked on the unencoded version of the dataset, and it performed well, but I wanted to see what difference the encoding would make; in theory the results should improve.
I can add the code here if necessary, but it is for UBM-GMM, and the reason I got confused is that the training method you described is what I use for the UBM.
If I simply encode the test images, I cannot use them in the classifier due to the size mismatch.
Perhaps I have not understood this correctly or have made some silly mistake. Would it be possible to get a simple example that would help me understand how this works?
thanks a lot