What is the difference between the Softmax and SoftmaxWithLoss layers in Caffe?

When defining a prototxt in Caffe, I sometimes use Softmax as the last layer type and sometimes SoftmaxWithLoss. I know that the Softmax layer returns the probability that the input belongs to each class, but it seems that SoftmaxWithLoss also returns class probabilities. So what is the difference between them? Or have I misunderstood how these two layer types are used?

1 answer

While Softmax returns the probability of each target class at prediction time, SoftmaxWithLoss not only applies the softmax operation to the predictions but also computes the multinomial logistic loss and returns that loss as its output. This is essential for the training phase: without a loss there is no gradient with which to update the network parameters.
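To make the distinction concrete, here is a minimal NumPy sketch of the two computations. It is not Caffe's code: the function names softmax and softmax_with_loss, the example scores, and the choice to average the loss over the batch are assumptions made for illustration.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax: turn raw class scores into probabilities."""
    # Subtract the row maximum for numerical stability (does not change the result).
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def softmax_with_loss(scores, labels):
    """Softmax followed by multinomial logistic loss.

    Returns the scalar loss (averaged over the batch, an assumption of this
    sketch) and the gradient of that loss with respect to the scores.
    """
    probs = softmax(scores)
    n = scores.shape[0]
    rows = np.arange(n)
    # Loss: negative log-probability assigned to the correct class.
    loss = -np.log(probs[rows, labels]).mean()
    # Gradient: (probabilities - one-hot labels) / batch size.
    grad = probs.copy()
    grad[rows, labels] -= 1.0
    grad /= n
    return loss, grad

# Hypothetical scores for a batch of two inputs and three classes.
scores = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])

probs = softmax(scores)                          # what a Softmax layer outputs
loss, grad = softmax_with_loss(scores, labels)   # what training needs

print(probs)
print(loss)
```

The first function is all that is needed to read off predictions; the second produces the scalar loss and the gradient that backpropagation starts from, which is why the loss variant is the one used during training.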

See SoftmaxWithLossLayer and Caffe Loss for more information.


Source: https://habr.com/ru/post/1662838/

