Shuffling training data when using an LSTM RNN

Since an LSTM RNN uses previous events to predict the current output, why do we shuffle the training data? Don't we lose the temporal order of the training data? How can a network trained on shuffled data still make effective predictions?


In general, when you shuffle training data (a set of sequences), you shuffle the order in which the sequences are fed to the RNN; you do not shuffle the order of the time steps within each individual sequence. Shuffling whole sequences is the normal thing to do when your network is stateless.
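A minimal sketch of that distinction, using hypothetical NumPy arrays (the shapes and names `X`, `y` are illustrative, not taken from the original question):

```python
import numpy as np

# Hypothetical toy dataset: 100 independent sequences,
# each 20 time steps long with 3 features per step.
X = np.random.rand(100, 20, 3).astype("float32")  # (sequences, timesteps, features)
y = np.random.rand(100, 1).astype("float32")      # one target per sequence

# Shuffle the ORDER of the sequences (axis 0) only;
# the time steps inside each sequence keep their original order.
perm = np.random.permutation(len(X))
X_shuffled, y_shuffled = X[perm], y[perm]
```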

Stateless case:

The network's memory is only retained for the duration of a single sequence. Training on sequence B before sequence A does not matter, because the network's internal state is not carried over from one sequence to the next.
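As a concrete illustration, here is a sketch of a stateless model trained with shuffling enabled. It assumes the tf.keras API (the question does not name a framework) and reuses the hypothetical arrays from the snippet above:

```python
import tensorflow as tf

# Stateless LSTM: the hidden state is reset for every sequence,
# so the order in which whole sequences arrive does not matter.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20, 3)),   # 20 time steps, 3 features
    tf.keras.layers.LSTM(32),               # stateful=False is the default
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Shuffling whole sequences (fit also re-shuffles them every epoch)
# is safe because no memory crosses sequence boundaries.
model.fit(X_shuffled, y_shuffled, epochs=5, batch_size=10, shuffle=True)
```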

On the other hand:

Stateful case:

The network's memory is carried over between sequences. Here you cannot blindly shuffle your data and expect optimal results. Sequence A must be fed to the network before sequence B, because A leads into B, and we want the network to evaluate sequence B with the memory of what happened in sequence A.
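A sketch of the stateful setup, again assuming the TensorFlow 2 / Keras 2-style tf.keras API (stateful RNN handling differs slightly in newer Keras versions) and the same hypothetical `X`, `y` arrays. Here the final state after batch t becomes the initial state for the matching sample position in batch t+1, so the batch order must be preserved:

```python
import tensorflow as tf

batch_size = 10  # for stateful RNNs the sample count must divide evenly by batch size

# Stateful LSTM: memory is carried over from one batch to the next,
# so shuffling is disabled and the data is fed in its original order.
stateful_model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, stateful=True,
                         batch_input_shape=(batch_size, 20, 3)),
    tf.keras.layers.Dense(1),
])
stateful_model.compile(optimizer="adam", loss="mse")

for epoch in range(5):
    # Sequence A is seen before sequence B, so B is evaluated
    # with the memory of what happened in A.
    stateful_model.fit(X, y, epochs=1, batch_size=batch_size, shuffle=False)
    stateful_model.reset_states()  # clear the carried-over memory between epochs
```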


Source: https://habr.com/ru/post/1269268/

