When defining a prototxt in Caffe, I sometimes use Softmax as the type of the last layer and sometimes SoftmaxWithLoss. I know that the Softmax layer returns the probability that the input belongs to each class, but it seems that SoftmaxWithLoss also returns class probabilities. What is the difference between them? Or have I misunderstood how these two layer types are used?
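To make the comparison concrete, here is a minimal pure-Python sketch of what I understand each layer to compute for a single sample. This is not Caffe code: the function names softmax and softmax_with_loss, the example scores, and the label are all made up for illustration, and I am assuming that SoftmaxWithLoss applies a softmax and then takes the negative log of the probability assigned to the true class.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_with_loss(scores, label):
    """Assumed behaviour: softmax followed by the negative log-likelihood
    of the true class, giving one scalar loss for this sample."""
    probs = softmax(scores)
    return -math.log(probs[label])

# Hypothetical scores from the last inner-product layer, and a made-up label.
scores = [2.0, 1.0, 0.1]
label = 0

probs = softmax(scores)                  # one probability per class
loss = softmax_with_loss(scores, label)  # a single number

print(probs)
print(loss)
```

If this sketch is right, the first function yields a vector of per-class probabilities and the second yields a single loss value that also needs the label as input, which may be the distinction I am missing.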