What is the loss function in simple words?

Can someone explain in simple words and possibly with some examples what the loss function in machine learning / neural networks is?

This came up while I was following the TensorFlow tutorial: https://www.tensorflow.org/get_started/get_started

+5

3 answers

The loss function is how you penalize your model's output.

The following example is for a supervised setting, i.e. one where you know what the correct output should be, although loss functions can also be applied in unsupervised settings.

Suppose you have a model that always predicts 1. Just a scalar value of 1.

Many loss functions can be applied to this model. One common choice is the L2 loss, the squared Euclidean distance between the prediction and the target.

If I pass in, say, the value 2 and I want my model to learn the function x ** 2, then the correct output should be 4 (because 2 * 2 = 4). Since the model predicts 1, the L2 loss comes out as ||4 - 1||^2 = 9.
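
A minimal sketch of that arithmetic in plain Python (the names model and l2_loss are made up for illustration):

```python
def model(x):
    # Toy model that always predicts 1, whatever the input.
    return 1.0

def l2_loss(prediction, target):
    # Squared (L2) distance between the prediction and the target.
    return (target - prediction) ** 2

x = 2.0
target = x ** 2                    # we want the model to learn x ** 2, so 4.0
print(l2_loss(model(x), target))   # ||4 - 1||^2 = 9.0
```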

We can also define our own loss function. For example, we could declare that the loss is always 10. Then, no matter what the model outputs, the loss stays constant.

Why do we care about loss functions? They quantify how poorly the model performed, and, in the context of backpropagation and neural networks, they determine the gradients at the final layer that are propagated backward so the model can learn.
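
To make that concrete, here is a sketch (plain Python, hand-derived gradients, hypothetical names) contrasting the gradient of the L2 loss above with the constant loss, which gives no learning signal at all:

```python
def l2_loss_grad(prediction, target):
    # Derivative of (target - prediction) ** 2 with respect to the prediction.
    return -2.0 * (target - prediction)

def constant_loss_grad(prediction, target):
    # A constant loss (always 10) does not depend on the prediction,
    # so its gradient is zero and nothing flows back to the weights.
    return 0.0

print(l2_loss_grad(1.0, 4.0))        # -6.0: stepping against this moves 1 toward 4
print(constant_loss_grad(1.0, 4.0))  # 0.0: the model has no way to improve
```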

Like the other commenters, I think you should start with the fundamentals. Here's a good place to start: http://neuralnetworksanddeeplearning.com/

+6

It describes how far the output your network produced is from the expected output: it indicates the magnitude of the error your model made in its prediction.

You can then take this error and "propagate it back" through your model, adjusting its weights so that its prediction is closer to the truth next time.
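
A minimal sketch of that loop, assuming a one-weight model w * x trained by gradient descent on a squared error (all names and numbers here are made up):

```python
w = 0.0                  # the model's single weight; it computes w * x
x, target = 3.0, 6.0     # we want it to learn "multiply by 2"
lr = 0.05                # learning rate

for step in range(100):
    prediction = w * x
    error = prediction - target    # how far off we are
    grad = 2.0 * error * x         # d/dw of (prediction - target) ** 2
    w -= lr * grad                 # adjust the weight against the error

print(w)  # ~2.0: each pass brings the prediction closer to the truth
```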

+5

Definition of a loss function: Let (X, A) be a measurable space and Y ⊂ ℝ a closed subset. Then a function L : X × Y × ℝ → [0, ∞) is called a loss function, or simply a loss, if it is measurable.

In what follows, we will interpret L(x, y, f(x)) as the cost, or loss, of predicting y by f(x) when x is observed; that is, the smaller the value of L(x, y, f(x)), the better f(x) predicts y. From this it is clear that constant loss functions, such as L := 0, are completely meaningless for our purposes, because they do not distinguish between good and bad predictions. Recall from the introduction that our main goal is a small average loss over future unseen observations (x, y).
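
The "small average loss over future observations" goal is usually written as an expected risk; here is a minimal LaTeX sketch using the standard notation of statistical learning theory (the symbol R_{L,P} is an assumption, not from the answer above):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Expected risk: the average loss of a predictor f over observations (x, y)
% drawn from the (unknown) data distribution P.
\[
  \mathcal{R}_{L,P}(f) = \int_{X \times Y} L\bigl(x, y, f(x)\bigr)\, \mathrm{d}P(x, y)
\]
\end{document}
```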

0
