I had the same problem, and the solution was to match the two MFCC arrays using the Dynamic Time Warping (DTW) algorithm.
After calculating the MFCCs, you should have one array for each of the two signals, in which each element contains the MFCCs for one frame (an array of arrays). The first step is to compute the "distance" between each element of one array and each element of the other, that is, the distance between every pair of MFCC vectors (you can try using Euclidean distance for this).
This should leave you with a two-dimensional array (let's call it "dist"), where the element (i, j) represents the distance between the MFCCs of the i-th frame of the first signal and the MFCCs of the j-th frame of the second signal.
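As a rough sketch of this step, the distance matrix could be built as below. The function name distance_matrix and the toy two-coefficient frames are made up for illustration; real MFCC frames typically have more coefficients per frame, and Euclidean distance is only one possible choice.

```python
import math

def distance_matrix(mfcc_a, mfcc_b):
    """Return dist, where dist[i][j] is the Euclidean distance between
    frame i of mfcc_a and frame j of mfcc_b (0-based indices)."""
    return [[math.dist(frame_a, frame_b) for frame_b in mfcc_b]
            for frame_a in mfcc_a]

# Toy data (hypothetical): 2 frames vs. 3 frames, 2 coefficients each.
mfcc_a = [[0.0, 0.0], [3.0, 4.0]]
mfcc_b = [[0.0, 0.0], [6.0, 8.0], [3.0, 4.0]]

dist = distance_matrix(mfcc_a, mfcc_b)  # 2 rows, 3 columns
```

Note that Python indexes from 0, so the element written (i, j) in the text corresponds to dist[i-1][j-1] here.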
You can now apply the DTW algorithm to this array, treating any dtw term whose index falls below 1 as infinite:
- dtw(1, 1) = dist(1, 1)
- dtw(i, j) = min(dtw(i-1, j-1), dtw(i-1, j), dtw(i, j-1)) + dist(i, j)
The value representing the "difference" between your two files is dtw(n, m), where n is the number of frames in the first signal and m is the number of frames in the second.
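The recurrence above can be sketched in a few lines. This is a minimal illustration, not a tuned implementation: the function name dtw_distance is made up, and the table is padded with an extra row and column of infinity (an assumption of this sketch) so that the first row and column need no special handling.

```python
import math

def dtw_distance(dist):
    """Accumulated DTW cost for a precomputed n-by-m distance matrix."""
    n, m = len(dist), len(dist[0])
    # Padded table: dtw[i][j] holds the 1-based dtw(i, j) from the text;
    # row 0 and column 0 are infinite, except the start cell.
    dtw = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0.0  # makes dtw(1, 1) come out as dist(1, 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_previous = min(dtw[i - 1][j - 1],  # diagonal step
                                dtw[i - 1][j],      # step in the first signal
                                dtw[i][j - 1])      # step in the second signal
            dtw[i][j] = best_previous + dist[i - 1][j - 1]
    return dtw[n][m]

# Toy distance matrix (hypothetical values): 2 frames vs. 3 frames.
toy_dist = [[0.0, 10.0, 5.0],
            [5.0, 5.0, 0.0]]
difference = dtw_distance(toy_dist)
```

The result grows with the length of the signals, so if you compare files of very different durations you may want to normalize it, for example by n + m; that normalization is a suggestion of this sketch rather than part of the recurrence above.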
For further reading, this document may give you an overview of the application of DTW to MFCC, and this presentation of the DTW algorithm may also help.