Does it make sense to use autoencoder for a batch normalized network?

As you know, the main problem in DNN is a long training time.

But there are several ways to speed up learning:

Batch normalization achieves the same accuracy with 14 times less training steps

  1. ReLU =max(x, 0) - rectified linear unit (ReLU, LReLU, PReLU, RReLU): https://arxiv.org/abs/1505.00853

The advantage of using the unsaturated activation function lies in two aspects: the first is to solve the so-called "exploding / disappearing" gradient, the second is to accelerate the convergence rate .

Or any: (maxout, ReLU-family, tanh)

  1. Quick weight initialization (avoiding fading or exploding gradients): https://arxiv.org/abs/1511.06856

Our initialization corresponds to the current state without supervision or self-controlled methods of preliminary preparation on a standard computer vision of a task, such as classification of images and detection of objects, roughly three orders of magnitude faster .

Or LSUV initialization (Layer-sequential unit-variance): https://arxiv.org/abs/1511.06422

But if we use all the steps: (1) Normalization of the batch, (2) ReLU, (3) Quick weight initialization or LSUV - then does it make sense to use autoencoder / autoassociator at any stage of training a deep neural network?

+4
1

TL;DR

. , , .

RBM . - . . , RBM (G. Hinton .) Autoencoders (Y. Bengio .) .

:

  • . (: ), - .
  • . , (1-) . , . , . , , .

RBM autoencoder . . , , , , .

,

? , ? , , .

, , . , .

+5

Source: https://habr.com/ru/post/1663339/


All Articles