Non-smooth and non-differentiable custom loss function in TensorFlow

  1. In TensorFlow, can you use a non-smooth function as a loss function, for example one that is piecewise (or defined with if-else)? If you cannot, then why is it fine to use ReLU?

  2. In this SLIM link, it says:

"For example, we might want to minimize journal losses, but our interest metrics could be an F1 rating, or Union Crossing rating (which are not differentiable, and therefore cannot be used as losses)."

Does this mean "not differentiable" at all, for example over the whole set of inputs? Because ReLU is not differentiable at the point 0 either.

  3. If you use such a custom loss function, do you need to implement the gradient yourself, or can TensorFlow do this automatically? I checked some custom loss functions, and they did not implement a gradient for their loss function.
3 answers

The problem is not that the loss is piecewise or non-smooth. The problem is that we need a loss function that can send a non-zero gradient back to the network parameters (dloss/dparameter) when there is an error between the output and the expected output. That applies to almost any function used inside the model (for example, loss functions, activation functions, attention functions).

Take, for example, the classical Perceptron, which uses the step function H(x) as its activation (H(x) = 1 if x > 0, else 0). The derivative of H(x) is 0 everywhere it exists and undefined at x = 0, so no gradient from the loss can pass back through it to the weights (chain rule), and the weights in front of it could never be updated by gradient descent. That is why gradient descent cannot train the Perceptron, but it can train modern neurons that use the sigmoid activation (whose gradient is non-zero for every x).

ReLU has derivative 1 for x > 0 and 0 otherwise. Although the derivative is undefined at x = 0, the loss gradient can still be propagated back through it for x > 0, which is why it can be used.
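To make this concrete, here is a minimal sketch, assuming TensorFlow 2.x and its GradientTape API, that compares the gradients autodiff produces for a hard step, a sigmoid, and ReLU (the input values are only illustrative):

# Compare what flows back through a hard step, a sigmoid, and ReLU.
import tensorflow as tf

x = tf.constant([-1.0, 0.0, 2.0])

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    step = tf.cast(x > 0, tf.float32)  # Heaviside-like step H(x)
    sig = tf.sigmoid(x)                # smooth activation
    relu = tf.nn.relu(x)               # piecewise-linear activation

print(tape.gradient(step, x))   # None: nothing flows back through the hard step
print(tape.gradient(sig, x))    # non-zero everywhere
print(tape.gradient(relu, x))   # 1.0 for x > 0, 0.0 otherwise (TF picks 0 at x = 0)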

The same reasoning applies to the loss itself. A metric such as F1 is computed from discrete counts, so its gradient is zero or undefined for every x and it cannot be optimised directly; instead we train with a differentiable surrogate loss, for example L2 or L1, that correlates with the metric we actually care about. (Note that the L1 "absolute difference" loss is itself not differentiable at x = 0, yet it is used without any trouble.)

In short, the loss does not have to be smooth everywhere; it only has to provide a useful non-zero gradient almost everywhere (as ReLU and L1 do).
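As an illustration of the original question, the following is a minimal sketch, assuming TensorFlow 2.x, of a piecewise (if-else style) loss written with tf.where, similar in spirit to the Huber loss; the name piecewise_loss and the sample numbers are made up for the example, and no gradient has to be written by hand:

# A piecewise loss: quadratic for small errors, linear for large ones.
import tensorflow as tf

def piecewise_loss(y_true, y_pred, delta=1.0):
    err = y_true - y_pred
    abs_err = tf.abs(err)
    quadratic = 0.5 * tf.square(err)
    linear = delta * (abs_err - 0.5 * delta)
    return tf.reduce_mean(tf.where(abs_err <= delta, quadratic, linear))

y_true = tf.constant([0.0, 1.0, 4.0])
y_pred = tf.Variable([0.5, 3.0, 0.0])

with tf.GradientTape() as tape:
    loss = piecewise_loss(y_true, y_pred)

# Non-zero gradients reach the parameters despite the if-else branches.
print(tape.gradient(loss, y_pred))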


To answer question #3 from the OP: no, you do not have to implement the gradient yourself. TensorFlow will do it for you automatically, and that is one of the things it is great at!
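For example, a minimal sketch assuming TensorFlow 2.x / Keras: my_l1_loss below is a made-up custom loss that defines only the forward computation, and the model still trains because autodiff supplies the backward pass:

# A hand-written loss with no gradient code; Keras differentiates it itself.
import numpy as np
import tensorflow as tf

def my_l1_loss(y_true, y_pred):
    return tf.reduce_mean(tf.abs(y_true - y_pred))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss=my_l1_loss)

x = np.random.rand(64, 3).astype("float32")
y = np.random.rand(64, 1).astype("float32")
model.fit(x, y, epochs=1, verbose=0)  # trains; gradients come from autodiff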

  1. Yes, in tf you can use a non-smooth, piecewise (if-else style) function as a loss. As the answer above explains, what matters is that a non-zero gradient can flow back through it to the parameters almost everywhere.

  2. One way to make such a function trainable is to approximate the non-differentiable, piecewise-constant shape with a combination of smooth sigmoid functions, so that a gradient exists almost everywhere. Below is an example in MATLAB; it returns both the value (s) and its derivative (ds).

function [s, ds] = QPWC_Neuron(z, sharp)
% A special case of a (quadruple) piece-wise constant neuron composed of Sigmoid functions
% There are three thresholds (junctures), 0.25, 0.5, and 0.75, respectively
% sharp determines how steep steps are between two junctures.
% The closer a point to one of junctures, the smaller its gradient will become. Gradients at junctures are zero.
% It handles 1D signals only, and it must be preceded by another activation function whose output falls within [0, 1]
% Example:
% z = 0:0.001:1;
% sharp = 100;

LZ = length(z);
s = zeros(size(z));
ds = s;
for l = 1:LZ
    if z(l) <= 0
        s(l) = 0;
        ds(l) = 0;
    elseif (z(l) > 0) && (z(l) <= 0.25)
        s(l) = 0.25 ./ (1+exp(-sharp*((z(l)-0.125)./0.25)));
        ds(l) = sharp/0.25 * (s(l)-0) * (1-(s(l)-0)/0.25);
    elseif (z(l) > 0.25) && (z(l) <= 0.5)
        s(l) = 0.25 ./ (1+exp(-sharp*((z(l)-0.375)./0.25))) + 0.25;
        ds(l) = sharp/0.25 * (s(l)-0.25) * (1-(s(l)-0.25)/0.25);
    elseif (z(l) > 0.5) && (z(l) <= 0.75)
        s(l) = 0.25 ./ (1+exp(-sharp*((z(l)-0.625)./0.25))) + 0.5;
        ds(l) = sharp/0.25 * (s(l)-0.5) * (1-(s(l)-0.5)/0.25);
    elseif (z(l) > 0.75) && (z(l) < 1)
        % For z above 0.75, a wider sigmoid (scale 0.5, centred at z = 1) is used so that the output reaches 1 at z = 1
        s(l) = 0.5 ./ (1+exp(-sharp*((z(l)-1)./0.5))) + 0.75;
        ds(l) = sharp/0.5 * (s(l)-0.75) * (1-(s(l)-0.75)/0.5);
    else
        s(l) = 1;
        ds(l) = 0;
    end
end
figure;
subplot 121, plot(z, s); xlim([0, 1]);grid on;
subplot 122, plot(z, ds); xlim([0, 1]);grid on;

end

[Figures: plots of the QPWC neuron output s(z) and its derivative ds(z) over [0, 1]]

  3. As for doing this in Python with tf, thanks to @papaouf_ai for the explanation that the gradient is handled automatically. But how would one implement a custom neuron like this in Python TensorFlow?
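For what it is worth, a rough sketch of the same idea in TensorFlow 2.x could look like the following; qpwc_activation is a made-up name, the step centres only approximate the MATLAB segments rather than porting them line by line, and no derivative has to be coded because autodiff differentiates the sigmoids:

# Approximate a staircase activation on [0, 1] by summing steep sigmoids.
import tensorflow as tf

def qpwc_activation(z, sharp=100.0):
    # Four steps of height 0.25 centred at 0.125, 0.375, 0.625 and 0.875
    # (the MATLAB version shapes the last segment slightly differently).
    s = 0.25 * tf.sigmoid(sharp * (z - 0.125) / 0.25)
    s += 0.25 * tf.sigmoid(sharp * (z - 0.375) / 0.25)
    s += 0.25 * tf.sigmoid(sharp * (z - 0.625) / 0.25)
    s += 0.25 * tf.sigmoid(sharp * (z - 0.875) / 0.25)
    return s

z = tf.Variable(tf.linspace(0.0, 1.0, 11))
with tf.GradientTape() as tape:
    s = qpwc_activation(z)

print(s)                    # staircase-like output in [0, 1]
print(tape.gradient(s, z))  # derivative supplied automatically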

Source: https://habr.com/ru/post/1661639/

