Find the difference between the value of one row and the values of all other rows in a Python pandas DataFrame

Initial Dataframe:

df = 
             Index   Nature  Interval
0            0       1       0.000000
1            1       1       0.999627
2            2       1       1.000607
3            3       1       1.000612

The total number of entries is about 700,000.

Is there a way to find the difference between one element of the Interval column and every other element of the same column, and then to repeat this for each remaining element of the frame?

I have found a workaround for this problem. Here is the fragment:

df["Potential"] = df["Interval"].apply(lambda x: np.sum([math.exp(-4 * abs(x - val)) for val in df['Interval']]))

However, this takes too much time, simply because of the use of the for loop.

So, is there a way to optimize this solution?

1 answer

You can use apply:

b = df["Interval"].apply(lambda x: np.sum(np.exp(-4 * (x - df.Interval).abs())))
print (b)
0    1.054885
1    3.010498
2    3.014339
3    3.014319
Name: Interval, dtype: float64

Or use NumPy: take the Interval values as an array and combine abs, np.exp and np.sum with broadcasting:

val = df.Interval.values
arr = np.sum(np.exp(-4*abs(val-val.reshape(len(df.index),-1))), axis=0)
print (arr)
[ 1.05488507  3.01049841  3.0143389   3.01431861]

df["Potential"] = arr
print (df)
   Index  Nature  Interval  Potential
0      0       1  0.000000   1.054885
1      1       1  0.999627   3.010498
2      2       1  1.000607   3.014339
3      3       1  1.000612   3.014319
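To make the broadcasting step easier to follow, here is a minimal self-contained sketch that rebuilds the four sample rows from the question and names the intermediate pairwise matrix. The variable names diff and potential are illustrative and do not appear in the answer; the computation itself is the same abs, np.exp and np.sum combination.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only the four rows shown there).
df = pd.DataFrame({
    "Index": [0, 1, 2, 3],
    "Nature": [1, 1, 1, 1],
    "Interval": [0.000000, 0.999627, 1.000607, 1.000612],
})

val = df["Interval"].to_numpy()        # shape (n,)

# Broadcasting a column (n, 1) against a row (1, n) yields an (n, n) matrix
# where diff[i, j] = val[i] - val[j]. "diff" is an illustrative name.
diff = val[:, None] - val[None, :]

# Sum exp(-4 * |difference|) across each row of the pairwise matrix.
potential = np.exp(-4 * np.abs(diff)).sum(axis=1)

df["Potential"] = potential
print(df)
```

Because the absolute difference is symmetric, summing over axis=0 or axis=1 gives the same result, which is why the two NumPy variants in this answer agree.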

A similar, more compact solution from piRSquared:

i = df.Interval.values
print (np.exp((np.abs(i[:, None] - i)) * -4).sum(1))
[ 1.05488507  3.01049841  3.0143389   3.01431861]
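One caveat for the full data set: the broadcasting solutions build an n x n array, and with about 700,000 rows that would be roughly 700,000 x 700,000 x 8 bytes, or about 3.9 TB of float64 values. A possible way around this, sketched below under that assumption, is to process the rows in blocks so that only a (chunk_size, n) slice exists at any time. The function name potential_chunked and its parameters chunk_size and scale are hypothetical and not part of the answers above; the work is still quadratic in the number of rows, so this only limits memory, not run time.

```python
import numpy as np

def potential_chunked(values, chunk_size=100, scale=4.0):
    """Hypothetical helper: for every x_i, sum exp(-scale * |x_i - x_j|) over all j,
    computing one block of rows at a time to bound memory use."""
    values = np.asarray(values, dtype=float)
    out = np.empty(len(values))
    for start in range(0, len(values), chunk_size):
        block = values[start:start + chunk_size]
        # Pairwise differences for this block only: shape (len(block), n).
        pairwise = np.abs(block[:, None] - values[None, :])
        out[start:start + chunk_size] = np.exp(-scale * pairwise).sum(axis=1)
    return out

# Usage with the sample values from the question; chunk_size=2 forces two blocks.
interval = np.array([0.000000, 0.999627, 1.000607, 1.000612])
result = potential_chunked(interval, chunk_size=2)
print(result)
```

With chunk_size=100 and 700,000 rows, each temporary array holds 100 x 700,000 floats, about 560 MB, so the block size may need tuning to the available memory.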

Source: https://habr.com/ru/post/1670151/

