NaN from sparse_softmax_cross_entropy_with_logits in Tensorflow

I get NaN when I try to use the loss function sparse_softmax_cross_entropy_with_logits in TensorFlow. I have a simple network, something like:

layer = tf.nn.relu(tf.matmul(inputs, W1) + b1)
layer = tf.nn.relu(tf.matmul(layer, W2) + b2)
logits = tf.matmul(layer, W3) + b3
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)

I have many classes (~10000), so I imagine I am getting NaN because the probability corresponding to the correct class, in at least one of my examples, is truncated to zero. Is there any way to avoid this?

+6

3 answers

It actually turns out that some of my labels were out of range (for example, a label of 14000 when my logits matrix is only 150 x 10000). It turns out that this produces NaN rather than an error.
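
To catch this earlier, here is a minimal sketch of a sanity check (not from the original answer), assuming TF1-style graph code, the labels and logits tensors from the question, and a hypothetical num_classes variable that matches the width of the logits:

import tensorflow as tf

# Fail loudly if any label falls outside [0, num_classes) instead of
# silently producing NaN in the loss.
num_classes = 10000  # assumed to equal the last dimension of `logits`
labels_in_range = tf.reduce_all(
    tf.logical_and(labels >= 0, labels < num_classes))
check_labels = tf.Assert(labels_in_range, [labels])

with tf.control_dependencies([check_labels]):
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)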

+9

tf.sparse_softmax_cross_entropy_with_logits handles the case of log(0) for you, so you do not have to worry about it.

Usually a NaN comes from a learning rate that is too high for your optimization algorithm. Try lowering it until the NaN errors disappear and the loss starts to decrease.
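
For the learning-rate case, a minimal sketch of what lowering it looks like with a plain TF1-style optimizer; the 1e-4 value and the choice of GradientDescentOptimizer are illustrative assumptions, not part of the original answer:

# If the loss turns into NaN after a few steps, try smaller learning
# rates until training becomes stable.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-4)
train_op = optimizer.minimize(tf.reduce_mean(loss))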

+4

In general, you get NaN when a value in the computation becomes 0 and you then take log(0), which is undefined.

You can avoid this by clipping the output of the softmax, for example:

out = tf.clip_by_value(out, 1e-10, 100.0)

or by adding a small constant to it:

out = out + 1e-10

However, when you use sparse_softmax_cross_entropy_with_logits(), the softmax is applied for you inside the function, so you cannot modify its output directly.

What you can do instead is implement the cross-entropy yourself and add something like 1e-10 to the output of the softmax:

loss = -tf.reduce_sum(labels*tf.log(tf.nn.softmax(logits) + 1e-10))

Keep in mind that with sparse_softmax_cross_entropy_with_logits() the labels variable holds the numeric index of each label, but if you implement the cross-entropy yourself, labels must be the one-hot encoding of those numeric labels.
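
Putting that together, a minimal sketch of the manual version, assuming integer labels from the question and a hypothetical num_classes variable equal to the width of the logits:

# Convert integer class indices to one-hot rows, then compute the
# cross-entropy by hand, adding a small epsilon so log() stays finite.
num_classes = 10000  # assumed to equal logits.shape[-1]
labels_one_hot = tf.one_hot(labels, depth=num_classes)
probs = tf.nn.softmax(logits)
loss = -tf.reduce_sum(labels_one_hot * tf.log(probs + 1e-10), axis=1)  # per-example losses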

Update: I fixed the answer thanks to @mdaoust's comment. As he pointed out, zeros only matter after the softmax has been applied to the logits, not before.

0

Source: https://habr.com/ru/post/1655131/
