Matching two series of MFCC coefficients

I extracted two MFCC series from two audio files, each about 30 seconds long, that contain the same speech. The files were recorded in the same place from different sources. I need to determine whether the two recordings contain the same conversation or different ones. So far I have tried computing the correlation between the two MFCC series, but the results are not very reasonable. Are there better methods for this scenario?

3 answers

I had the same problem, and solved it by aligning the two MFCC arrays with the Dynamic Time Warping (DTW) algorithm.

After calculating the MFCCs, you should have one array for each of the two signals, in which each element holds the MFCCs of one frame (an array of arrays). The first step is to compute the "distance" between each element of one array and each element of the other, that is, the distance between every pair of MFCC sets (you can try the Euclidean distance for this).

This should leave you with a two-dimensional array (let's call it "dist"), where element (i, j) is the distance between the MFCCs of the i-th frame of the first signal and the MFCCs of the j-th frame of the second signal.
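As a minimal sketch of this step, assuming Python with NumPy (the thread does not name a language for it) and MFCCs stored as a frames-by-coefficients array, the "dist" array could be built as follows. The function name `pairwise_distances` and the toy data are made up for illustration; note that Python indexes from 0 while the text counts frames from 1.

```python
import numpy as np

def pairwise_distances(mfcc_a, mfcc_b):
    """Euclidean distance between every frame of mfcc_a and every frame of mfcc_b.

    mfcc_a has shape (n, k) and mfcc_b has shape (m, k), where k is the
    number of coefficients per frame. The result has shape (n, m).
    """
    a = np.asarray(mfcc_a, dtype=float)
    b = np.asarray(mfcc_b, dtype=float)
    # Broadcasting yields an (n, m, k) array of per-coefficient differences.
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

# Toy "MFCC" data: 2 frames vs. 3 frames, 2 coefficients each.
a = np.array([[0.0, 0.0], [3.0, 4.0]])
b = np.array([[0.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = pairwise_distances(a, b)
print(dist)
```

Here `dist[i, j]` corresponds to dist(i+1, j+1) in the notation used in this answer.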

You can now apply the DTW algorithm to this array:

  • dtw(1, 1) = dist(1, 1)
  • dtw(i, j) = min(dtw(i-1, j-1), dtw(i-1, j), dtw(i, j-1)) + dist(i, j), where any term with an index below 1 is treated as infinite.

The value representing the "difference" between your two files is dtw(n, m), where n is the number of frames in the first signal and m is the number of frames in the second.
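The recurrence above can be sketched in Python with NumPy (an assumed choice of language; the name `dtw_cost` is hypothetical). The accumulated-cost table gets an extra row and column of infinities so that out-of-range neighbours never win the minimum, which is one common way to handle the first row and column.

```python
import numpy as np

def dtw_cost(dist):
    """Accumulated DTW cost for an (n, m) matrix of frame-to-frame distances."""
    dist = np.asarray(dist, dtype=float)
    n, m = dist.shape
    # acc is shifted by one: acc[i, j] holds dtw(i, j) in 1-based notation.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0  # makes dtw(1, 1) equal dist(1, 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_previous = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_previous
    return acc[n, m]

# Toy distance matrix: 2 frames in the first signal, 3 in the second.
dist = np.array([[0.0, 4.0, 5.0],
                 [5.0, 3.0, 0.0]])
total = dtw_cost(dist)
print(total)  # 3.0
```

The raw value grows with the length of the signals, so when comparing files of different durations it may help to divide it by something like n + m; that normalization is a common convention, not part of the recurrence given here.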

For further reading, this document may give you an overview of the application of DTW to MFCC, and this presentation of the DTW algorithm may also help.


Since these two vectors are effectively histograms, you can try computing the chi-squared distance between them (a common distance measure for histograms).

d(x, y) = sum over i of (x(i) - y(i))^2 / (2 * (x(i) + y(i)))

In this toolbox you can find a good implementation (mex):

http://www.mathworks.com/matlabcentral/fileexchange/15935-computing-pairwise-distances-and-metrics

Call it as follows:

 d = slmetric_pw(X, Y, 'chisq'); 
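For readers without that toolbox, the chi-squared formula from this answer can be sketched in Python with NumPy (an assumed choice; `chi_square_distance` and its `eps` parameter are hypothetical names, not part of the toolbox). The sketch assumes non-negative inputs, as for histograms, and skips bins where x(i) + y(i) is zero to avoid dividing by zero. Raw MFCC values can be negative, so they would probably need to be shifted or binned before this measure behaves sensibly.

```python
import numpy as np

def chi_square_distance(x, y, eps=1e-12):
    """Chi-squared distance: sum of (x - y)^2 / (2 * (x + y)) over all bins."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = 2.0 * (x + y)
    # Ignore bins where both inputs are (numerically) zero.
    mask = np.abs(denom) > eps
    return float(np.sum((x[mask] - y[mask]) ** 2 / denom[mask]))

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 0.0, 5.0])
d = chi_square_distance(x, y)
print(d)  # 1.25
```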

Recently, I have encountered the same problem. The best way I've found is to use the MIRtoolbox audio library, which is very effective in terms of sound processing.

After adding this library, the distance between two MFCCs can be calculated with a single call (a smaller distance means a closer match):

 dist = mirgetdata(mirdist(mfcc1, mfcc2)); 

Source: https://habr.com/ru/post/1385361/

