The loss function is how you penalize your model's output.
The following example is for a supervised setting, that is, when you know what the correct result should be, although loss functions can also be applied in unsupervised settings.
Suppose you have a model that always predicts 1. Just a scalar value of 1.
Many loss functions can be applied to this model. The L2 loss, for example, is the squared Euclidean distance between the prediction and the target.
If I pass in, say, the value 2 and I want my model to learn the function x ** 2, then the correct result should be 4 (because 2 * 2 = 4). Applying the L2 loss, we calculate it as || 4 - 1 || ^ 2 = 9.
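As a quick illustration, here is a minimal Python sketch of that calculation (the names prediction and target are just illustrative, not from any particular library):

```python
def l2_loss(prediction, target):
    """Squared L2 distance between the prediction and the target."""
    return (target - prediction) ** 2

# The model always predicts 1; for input 2 the true value of x ** 2 is 4.
prediction = 1
target = 2 ** 2
print(l2_loss(prediction, target))  # (4 - 1) ** 2 = 9
```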
We can also define our own loss function. For instance, we could say that the loss is always equal to 10: then, no matter what the model predicts, the loss stays constant.
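Such a (deliberately useless) constant loss could look like this:

```python
def constant_loss(prediction, target):
    """Always returns 10, no matter how good or bad the prediction is."""
    return 10

print(constant_loss(1, 4))  # 10
print(constant_loss(4, 4))  # still 10, even for a perfect prediction
```

Because it never changes, its gradient is zero everywhere, so it gives the model no signal to improve, which leads into the next point.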
Why do we care about loss functions? They measure how poorly the model performed, and in the context of backpropagation and neural networks they determine the gradients at the final layer that must be propagated backward for the model to learn.
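To make the connection to gradients concrete, here is a small sketch using PyTorch's autograd (assuming you have torch installed; any autodiff framework would work the same way):

```python
import torch

# A trivial "model": a single learnable scalar, initialized to 1.
w = torch.tensor(1.0, requires_grad=True)

target = torch.tensor(4.0)   # what we wanted (2 ** 2)
loss = (target - w) ** 2     # L2 loss: (4 - 1) ** 2 = 9

loss.backward()              # backpropagation computes d(loss)/dw
print(loss.item())           # 9.0
print(w.grad.item())         # -6.0, i.e. the slope that tells us how to move w

# A constant loss like the one above has derivative 0 everywhere,
# so gradient descent would never move w and the model could never learn.
```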
Like the other commenters, I think you should start with the fundamentals. Here's a good place to start: http://neuralnetworksanddeeplearning.com/