I often read that there are fundamental differences between feedforward and recurrent neural networks (RNNs), due to the lack of an internal state, and therefore short-term memory, in feedforward networks. That seemed plausible to me at first sight.
However, if I understand correctly, when training a recurrent neural network with the backpropagation through time (BPTT) algorithm, the recurrent network is unrolled into an equivalent feedforward network.
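To make concrete what I mean by unrolling, here is a minimal NumPy sketch (my own illustration, with made-up weight names like `W_xh` and `W_hh`): for a fixed sequence length, the recurrent loop can be written out as explicit layers, one per time step, with the same weights reused in every layer.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4

# One RNN cell: weights shared across all time steps.
W_xh = rng.standard_normal((n_hidden, n_in)) * 0.1      # input -> hidden
W_hh = rng.standard_normal((n_hidden, n_hidden)) * 0.1  # hidden -> hidden (recurrent)
b_h = np.zeros(n_hidden)

xs = rng.standard_normal((3, n_in))  # an input sequence of length T = 3

# Recurrent view: one cell applied repeatedly, carrying a state h.
h = np.zeros(n_hidden)
for x_t in xs:
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

# Unrolled ("feedforward") view: three explicit layers with tied weights.
h0 = np.zeros(n_hidden)
h1 = np.tanh(W_xh @ xs[0] + W_hh @ h0 + b_h)  # layer 1
h2 = np.tanh(W_xh @ xs[1] + W_hh @ h1 + b_h)  # layer 2
h3 = np.tanh(W_xh @ xs[2] + W_hh @ h2 + b_h)  # layer 3

assert np.allclose(h, h3)  # identical result for this fixed T
```

For a fixed T the two views compute exactly the same thing, which is what prompts my question.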
This would mean that there is really no fundamental difference. So why do RNNs perform better than deep feedforward networks on certain tasks (image recognition, time series forecasting, ...)?