Where should I add dropout in a neural network?

I have seen dropout described as being applied in different parts of a neural network:

  • dropout on the weight matrix,

  • dropout in a hidden layer, after the matrix multiplication and before the ReLU,

  • dropout in a hidden layer after the ReLU,

  • and dropout on the output that feeds the softmax function.

I'm a little confused about where I should apply dropout. Can anyone help me figure this out? Thanks!
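For concreteness, the four candidate placements could be sketched like this (a minimal NumPy sketch; the shapes, variable names, and drop probability `p` are made up for illustration, using inverted dropout so survivors are rescaled by 1/(1-p)):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5  # drop probability (assumed value for illustration)

def relu(x):
    return np.maximum(x, 0.0)

def drop_mask(shape, rng):
    # Bernoulli keep-mask with inverted-dropout rescaling by 1/(1-p).
    return (rng.random(shape) >= p) / (1.0 - p)

x = rng.standard_normal((4, 5))   # batch of 4 inputs
W1 = rng.standard_normal((5, 8))  # input -> hidden weights
W2 = rng.standard_normal((8, 3))  # hidden -> output weights

# 1) dropout on the weight matrix itself
h1 = relu(x @ (W1 * drop_mask(W1.shape, rng)))
# 2) dropout in the hidden layer, after the matmul, before the ReLU
h2 = relu((x @ W1) * drop_mask((4, 8), rng))
# 3) dropout in the hidden layer, after the ReLU
h3 = relu(x @ W1) * drop_mask((4, 8), rng)
# 4) dropout on the output that feeds the softmax
logits4 = (relu(x @ W1) @ W2) * drop_mask((4, 3), rng)
```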

1 answer

So:

  • The first use you described is dropout applied to the weights themselves (sometimes called DropConnect).
  • The second and third uses you described are the same thing, and they are usually described as dropout on the activations: since a 0/1 mask commutes with ReLU (ReLU(0) = 0), masking before or after the nonlinearity gives the same result. It is also easy to see that this can be expressed as dropout on the weights in which an entire row (or column, depending on the implementation) is disabled.
  • The fourth case is not a correct use of dropout: the layer you would be applying it to is the output layer, so it is not a good idea to use dropout there.
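The equivalence claimed in the second bullet can be checked directly. Here is a minimal NumPy sketch (shapes and names are my own, assuming inverted dropout with drop probability p = 0.5): it applies dropout to the hidden activations after the ReLU, then verifies that, for one sample, this matches zeroing the corresponding rows of the next weight matrix, and that the mask commutes with ReLU.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def dropout(x, p, rng):
    """Inverted dropout: zero each unit with probability p and rescale
    survivors by 1/(1-p), so the expected activation is unchanged."""
    keep = (rng.random(x.shape) >= p).astype(x.dtype)
    return x * keep / (1.0 - p), keep

p = 0.5
x = rng.standard_normal((4, 5))
W1 = rng.standard_normal((5, 8))
W2 = rng.standard_normal((8, 3))

# Standard placement: dropout on the hidden activations, after the ReLU.
h = relu(x @ W1)
h_drop, keep = dropout(h, p, np.random.default_rng(1))
logits = h_drop @ W2

# Equivalence: for one sample, zeroing hidden unit j is the same as
# zeroing row j of the next weight matrix W2 (with the same rescaling).
W2_masked = W2 * keep[0][:, None]
logits_via_weights = (h[0] / (1.0 - p)) @ W2_masked
assert np.allclose(logits[0], logits_via_weights)

# The 0/1 mask commutes with ReLU, which is why placements 2 and 3
# in the question coincide: ReLU(keep * z) == keep * ReLU(z).
z = x @ W1
assert np.allclose(relu(z * keep), relu(z) * keep)
```

At evaluation time dropout is simply turned off (the identity), which is exactly what the 1/(1-p) rescaling during training is there to allow.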

Source: https://habr.com/ru/post/1259288/
