How to get audio frequency data from a wave file?

I want to build a speech recognition mechanism in Ruby. I know I will never get all the way there; I'm doing it just for fun. I need the frequency data of the sound stored in a WAV file so that I can compare it with data I already have for the different sounds I want to recognize. I will write the code in Ruby, but I don't think there are any Ruby libraries for this, and they would probably be too slow anyway. The good thing about Ruby is that I can use .NET libraries through IronRuby or Java libraries through JRuby. How can I get the frequency data?

2 answers

A WAV file is not too complicated; in fact it is essentially just a series of audio samples: http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
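To give an idea of how little is involved, here is a rough sketch in plain Ruby that pulls the samples out of a WAV file. The helper name `parse_wav` is made up for illustration, and it assumes the simplest case only: uncompressed 16-bit PCM (format tag 1). Anything else (8-bit, 24-bit, float, compressed) would need extra handling.

```ruby
# Minimal sketch, not a complete WAV reader: walks the RIFF chunks of a
# binary string and returns the format info plus the raw 16-bit samples.
# `parse_wav` is a hypothetical helper name; only 16-bit PCM is handled.
def parse_wav(bytes)
  bytes = bytes.b # force binary encoding so indexing is by byte
  unless bytes[0, 4] == "RIFF" && bytes[8, 4] == "WAVE"
    raise ArgumentError, "not a RIFF/WAVE file"
  end

  fmt = nil
  samples = nil
  pos = 12 # first chunk starts right after "RIFF", size, "WAVE"
  while pos + 8 <= bytes.bytesize
    id   = bytes[pos, 4]
    size = bytes[pos + 4, 4].unpack1("V") # little-endian uint32
    body = bytes[pos + 8, size]
    case id
    when "fmt "
      tag, channels, rate, _byte_rate, _align, bits = body.unpack("vvVVvv")
      fmt = { format: tag, channels: channels, sample_rate: rate, bits: bits }
    when "data"
      samples = body.unpack("s<*") # signed 16-bit little-endian integers
    end
    pos += 8 + size + (size.odd? ? 1 : 0) # chunks are padded to even length
  end

  raise ArgumentError, "missing fmt or data chunk" unless fmt && samples
  unless fmt[:format] == 1 && fmt[:bits] == 16
    raise ArgumentError, "only 16-bit PCM is supported in this sketch"
  end

  fmt.merge(samples: samples)
end
```

You would call it as `parse_wav(File.binread("some.wav"))`. For stereo files the samples come back interleaved (left, right, left, right, ...), so you would need to split or average the channels before analysing them.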

Once you can read the samples, the next step is to run them through an FFT (fast Fourier transform) to get the frequency content. There is bound to be an open source implementation you can use, or you can implement it yourself.
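If you do want to implement it yourself, a textbook recursive radix-2 FFT fits in a few lines of Ruby. This is only a sketch: the names `fft` and `magnitude_spectrum` are my own, the input length has to be a power of two, and it is far slower than a native library, so treat it as a way to understand the idea rather than something to run on long recordings.

```ruby
# Recursive radix-2 Cooley-Tukey FFT (illustrative sketch).
# The input length must be a power of two; returns an array of Complex.
def fft(x)
  n = x.size
  unless n.positive? && (n & (n - 1)).zero?
    raise ArgumentError, "length must be a power of two"
  end
  return [Complex(x[0])] if n == 1

  even = fft(x.each_slice(2).map(&:first)) # samples at even indices
  odd  = fft(x.each_slice(2).map(&:last))  # samples at odd indices
  half = n / 2
  out  = Array.new(n)
  half.times do |k|
    # Twiddle factor e^(-2*pi*i*k/n) applied to the odd half
    t = Complex.polar(1.0, -2.0 * Math::PI * k / n) * odd[k]
    out[k]        = even[k] + t
    out[k + half] = even[k] - t
  end
  out
end

# Returns [frequency_in_hz, magnitude] pairs for the first n/2 bins
# (for real input the upper half mirrors the lower half).
def magnitude_spectrum(samples, sample_rate)
  n = samples.size
  fft(samples).first(n / 2).each_with_index.map do |value, k|
    [k * sample_rate.to_f / n, value.abs]
  end
end
```

Each bin `k` corresponds to the frequency `k * sample_rate / n`, so with 8000 Hz audio and 64 samples per block the bins are 125 Hz apart. In practice you would cut the recording into short blocks and analyse each block separately, since the frequency content of speech changes over time.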

What you are trying to do requires some understanding of sound and the math behind signal processing, so you might want to start with a book on this subject.


[The text of this answer was lost in extraction. The only terms that survive are MFCC (Mel) in the first bullet and GMM and SVM in the second.]


Source: https://habr.com/ru/post/1742578/

