I need to train a bi-directional LSTM model for isolated-word speech recognition (spoken digits 0 through 9). I recorded speech from 100 speakers. What should I do next? (Assume I have already split the recordings into separate .wav files, one digit per file.) I plan to use MFCCs as the input features for the network.
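For reference, this is roughly how I intend to extract the features. It is a minimal sketch assuming librosa; the file name and the parameter choices (16 kHz sample rate, 13 coefficients) are just illustrative:

```python
import librosa

# Load one utterance (a single spoken digit) at 16 kHz mono.
# The file name here is a placeholder for one of my split .wav files.
signal, sr = librosa.load("digit_0_speaker_01.wav", sr=16000)

# Extract 13 MFCCs per frame; result has shape (n_mfcc, n_frames).
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)

# Transpose to (n_frames, n_mfcc) so each time step is one feature
# vector fed to the bi-directional LSTM.
features = mfcc.T
```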
In addition, I would like to know how the dataset (or its labeling) would need to differ if I use a library that supports CTC (Connectionist Temporal Classification).
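To make the CTC part of the question concrete, here is a hedged sketch using PyTorch's `nn.CTCLoss`. The shapes, hidden size, and the label shift are assumptions for illustration, not a full training loop; the point is that CTC only needs the unsegmented target sequence (here, one digit per file), not frame-level alignments:

```python
import torch
import torch.nn as nn

num_classes = 11           # 10 digits + 1 CTC blank symbol (index 0 by convention)
T, N, F = 80, 4, 13        # frames per utterance, batch size, MFCC dimension

# Bi-directional LSTM over the MFCC frames, projected to per-frame class scores.
lstm = nn.LSTM(input_size=F, hidden_size=64, bidirectional=True)
proj = nn.Linear(2 * 64, num_classes)

x = torch.randn(T, N, F)                   # a dummy batch of MFCC sequences
out, _ = lstm(x)
log_probs = proj(out).log_softmax(dim=-1)  # (T, N, num_classes), as CTCLoss expects

# With CTC the targets are just the label sequences, with no alignment:
# one digit per file in this task, so each target has length 1.
targets = torch.tensor([3, 7, 0, 9]) + 1   # shift by 1 so index 0 stays the blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.ones(N, dtype=torch.long)

loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
```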