Label smoothing (soft targets) in Pandas

Question

Pandas has a get_dummies function that one-hot encodes a categorical variable. Now I want to apply label smoothing as described in section 7.5.1 of the Deep Learning book:

Label smoothing regularizes a model based on a softmax with k output values by replacing the hard 0 and 1 classification targets with targets of eps / k and 1 - (k - 1) / k * eps, respectively.
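
To make the quoted targets concrete, here is a small arithmetic sketch. The values k = 4 and eps = 0.1 are illustrative assumptions, not taken from the book; the point is only that one smoothed "on" target plus k - 1 smoothed "off" targets still sum to 1.

```python
# Illustrative values only (assumptions, not from the book)
k = 4        # number of classes
eps = 0.1    # smoothing strength

off_target = eps / k                 # replaces the hard 0
on_target = 1 - (k - 1) / k * eps    # replaces the hard 1

# One "on" entry plus (k - 1) "off" entries should still form a distribution
row_sum = on_target + (k - 1) * off_target
print(off_target, on_target, row_sum)
```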

What would be the most efficient and/or elegant way to apply label smoothing in a Pandas dataframe?

python pandas machine-learning

Sergii Gryshkevych Sep 05 '16 at 17:50

1 answer

lejlot · Accepted Answer · 2016-09-05T18:11:33+0000

First, let's use a much simpler equation (ϵ denotes how much probability mass you move away from the "true label" and spread evenly over all the other labels).

    1 -> 1 - ϵ
    0 -> ϵ / (k - 1)

Because each entry x is either 0 or 1, both cases collapse into a single arithmetic expression, so all you have to do is

 x -> x * (1 - ϵ) + (1-x) * ϵ / (k-1)

So if your dummy columns are a, b, c, d, just do

    indices = ['a', 'b', 'c', 'd']
    eps = 0.1
    df[indices] = df[indices] * (1 - eps) + (1 - df[indices]) * eps / (len(indices) - 1)

which for

    >>> df
       a  b  c  d
    0  1  0  0  0
    1  0  1  0  0
    2  0  0  0  1
    3  1  0  0  0
    4  0  1  0  0
    5  0  0  1  0

returns

              a         b         c         d
    0  0.900000  0.033333  0.033333  0.033333
    1  0.033333  0.900000  0.033333  0.033333
    2  0.033333  0.033333  0.033333  0.900000
    3  0.900000  0.033333  0.033333  0.033333
    4  0.033333  0.900000  0.033333  0.033333
    5  0.033333  0.033333  0.900000  0.033333

as expected.
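
For completeness, here is a minimal end-to-end sketch that starts from raw labels instead of a ready-made dummy frame. The helper name `smooth_labels` and the sample `labels` series are hypothetical, chosen to reproduce the frame above. It assumes every class appears at least once in the data, since k is taken from the number of dummy columns. Recent pandas versions return boolean columns from `get_dummies`, so the sketch casts to float before doing arithmetic.

```python
import numpy as np
import pandas as pd


def smooth_labels(one_hot: pd.DataFrame, eps: float = 0.1) -> pd.DataFrame:
    """Hypothetical helper: map 1 -> 1 - eps and 0 -> eps / (k - 1)."""
    k = one_hot.shape[1]  # number of classes = number of dummy columns
    if k < 2:
        raise ValueError("label smoothing needs at least two classes")
    # get_dummies may return bool columns, so cast before arithmetic
    one_hot = one_hot.astype(float)
    return one_hot * (1 - eps) + (1 - one_hot) * eps / (k - 1)


# Sample labels (assumed) that yield the same one-hot frame as above
labels = pd.Series(["a", "b", "d", "a", "b", "c"])
one_hot = pd.get_dummies(labels)
smoothed = smooth_labels(one_hot, eps=0.1)

# Each row should remain a valid probability distribution
assert np.allclose(smoothed.sum(axis=1), 1.0)
print(smoothed)
```

Deriving k from the column count keeps the helper independent of hard-coded column names, at the cost of silently using a smaller k if some class is missing from the data.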
