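So, from what I read, it seems you would like to use Dynamic Time Warping (DTW). I will leave the full explanation to Wikipedia, but in short it measures how similar two sequences are even when one is stretched or compressed in time relative to the other. It is commonly used in speech recognition to match patterns despite differences in speaking speed or pronunciation.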
Unfortunately, I am more knowledgeable in C, Java and Python, so I will suggest Python libraries.
With rpy2, you can call R from your Python code and use the DTW implementation available there. Unfortunately, I could not find good tutorials for this route, but there are good examples if you decide to use R directly.
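If you would rather not depend on R at all, the core of DTW is small enough to write by hand. Below is a minimal sketch of the classic dynamic-programming version in plain Python. The function name `dtw_distance` and the choice of absolute difference as the local cost are my own assumptions for illustration, not anything taken from a particular library, and it is not optimized for long sequences.

```python
def dtw_distance(a, b):
    """Return the DTW distance between two numeric sequences.

    Illustrative sketch: uses |x - y| as the local cost and the
    standard three-way recurrence, with no window constraint.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    inf = float("inf")

    # cost[i][j] holds the cheapest accumulated cost of aligning
    # the first i items of a with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )

    return cost[n][m]


# The same shape played at half speed still matches perfectly.
slow = [0, 0, 1, 1, 2, 2, 3, 3]
fast = [0, 1, 2, 3]
print(dtw_distance(slow, fast))  # 0.0
```

The point of the example is that `slow` and `fast` have different lengths, yet the distance is zero because DTW is allowed to align one sample of `fast` with several samples of `slow`. For real data you would probably want a proper library with windowing and multivariate support, but this should be enough to experiment with.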
Please let me know if this does not help. Cheers!