I have a dataframe as shown below. The last column shows the sum of the values from all the other numeric columns, i.e. A, B, D, K and T. Note that some of the columns contain NaN.
word1,A,B,D,K,T,sum
na,,63.0,,,870.0,933.0
sva,,1.0,,3.0,695.0,699.0
a,,102.0,,1.0,493.0,596.0
sa,2.0,487.0,,2.0,15.0,506.0
su,1.0,44.0,,136.0,214.0,395.0
waw,1.0,9.0,,34.0,296.0,340.0
How can I calculate the entropy for each row? That is, I need to compute something like the following:
df['A']/df['sum']*log(df['A']/df['sum']) + df['B']/df['sum']*log(df['B']/df['sum']) + ...... + df['T']/df['sum']*log(df['T']/df['sum'])
The condition is that whenever the value inside the log becomes zero or NaN, the whole term should be treated as zero (log 0 is undefined, so evaluating it directly would fail).
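For reference, here is a minimal sketch of what I am after, assuming the data is loaded with pandas and the natural logarithm is wanted. The names `cols`, `p`, `terms` and the `entropy` column are my own placeholders. Note that Shannon entropy is usually defined as the negative of the sum written above, so the sketch negates it; drop the minus sign if the raw sum is what you need.

```python
import io

import numpy as np
import pandas as pd

csv_text = """word1,A,B,D,K,T,sum
na,,63.0,,,870.0,933.0
sva,,1.0,,3.0,695.0,699.0
a,,102.0,,1.0,493.0,596.0
sa,2.0,487.0,,2.0,15.0,506.0
su,1.0,44.0,,136.0,214.0,395.0
waw,1.0,9.0,,34.0,296.0,340.0
"""

df = pd.read_csv(io.StringIO(csv_text))

cols = ["A", "B", "D", "K", "T"]  # value columns (placeholder name)

# Row-wise proportions: each value divided by that row's sum.
p = df[cols].div(df["sum"], axis=0)

# Mask zeros so log(0) is never evaluated; NaN entries stay NaN.
p = p.where(p > 0)

# p * log(p) per cell; masked cells remain NaN.
terms = p * np.log(p)

# sum() skips NaN by default, so missing or zero cells contribute 0.
# The leading minus gives the usual (non-negative) Shannon entropy.
df["entropy"] = -terms.sum(axis=1)

print(df[["word1", "entropy"]])
```

Replacing `np.log` with `np.log2` would give the entropy in bits instead of nats.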