It depends on the type of neural network. With most network types, you fix the number of input neurons when you design the network, so you cannot feed it data of arbitrary length. For longer sequences, you need to either crop your data or use a sliding window.
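To make the two workarounds concrete, here is a minimal sketch in plain Python. The helper names `crop_or_pad` and `sliding_windows` are made up for illustration, and the zero-padding of short inputs is my own assumption (the text above only covers inputs that are too long).

```python
def crop_or_pad(seq, size, pad_value=0.0):
    """Return exactly `size` items: truncate long input, right-pad short input."""
    seq = list(seq)[:size]
    # Padding is an assumption for inputs shorter than the network expects.
    return seq + [pad_value] * (size - len(seq))


def sliding_windows(seq, size, step=1):
    """Split `seq` into windows of `size` items whose starts are `step` apart."""
    seq = list(seq)
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


samples = [0.1, 0.4, 0.3, 0.9, 0.7, 0.2]

# One fixed-size input: everything after the fourth item is discarded.
cropped = crop_or_pad(samples, 4)

# Several fixed-size inputs: each window is fed to the network separately.
windows = sliding_windows(samples, 4, step=2)
```

Cropping throws data away, while the sliding window keeps all of it but leaves you to combine the per-window outputs yourself (for example by averaging or voting).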
However, some neural networks can process input sequences of arbitrary length, for example a recurrent neural network (RNN). An RNN seems to be a very good candidate for your problem. Here's a good article describing the implementation of a certain type of RNN called Long Short-Term Memory (LSTM), which works great for speech recognition.
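To show why a recurrent network does not care about input length, here is a rough sketch of a plain tanh RNN in pure Python. Note that this is a simple RNN and not an LSTM, the weights are random and untrained, and the names (`rnn_final_state`, `HIDDEN`) are hypothetical. The point is only that the same weights are reused at every step, so the loop runs for as many items as the sequence has.

```python
import math
import random


def rnn_final_state(sequence, w_in, w_rec, bias):
    """Run a single-layer tanh RNN over `sequence` and return the last hidden state."""
    hidden = [0.0] * len(bias)
    for x in sequence:  # one step per input item, however many there are
        # The new state is built from the previous `hidden` before it is reassigned.
        hidden = [
            math.tanh(
                w_in[j] * x
                + sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + bias[j]
            )
            for j in range(len(hidden))
        ]
    return hidden


random.seed(0)
HIDDEN = 3  # hypothetical hidden-state size, chosen only for the example
w_in = [random.uniform(-1, 1) for _ in range(HIDDEN)]
w_rec = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(HIDDEN)]
bias = [0.0] * HIDDEN

# Sequences of different lengths go through the same network unchanged.
short_state = rnn_final_state([0.1, 0.4], w_in, w_rec, bias)
long_state = rnn_final_state([0.1, 0.4, 0.3, 0.9, 0.7, 0.2, 0.5], w_in, w_rec, bias)
```

Both calls return a hidden state of the same size, which you could then pass to an ordinary fixed-size output layer. In practice you would probably use an existing library's LSTM layer rather than writing the loop by hand.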