Element-wise maximum of two DataFrames, ignoring NaNs

I have two data frames (df1 and df2), each of which has the same rows and columns. I would like to take the element-wise maximum of these two data frames. In addition, the element-wise maximum of a number and NaN should be the number. The approach that I have implemented so far seems inefficient:

    def element_max(df1,df2):
        import pandas as pd
        cond = df1 >= df2
        res = pd.DataFrame(index=df1.index, columns=df1.columns)
        res[(df1==df1)&(df2==df2)&(cond)] = df1[(df1==df1)&(df2==df2)&(cond)]
        res[(df1==df1)&(df2==df2)&(~cond)] = df2[(df1==df1)&(df2==df2)&(~cond)]
        res[(df1==df1)&(df2!=df2)&(~cond)] = df1[(df1==df1)&(df2!=df2)]
        res[(df1!=df1)&(df2==df2)&(~cond)] = df2[(df1!=df1)&(df2==df2)]
        return res

Any other ideas? Thank you for your time.

1 answer

You can use where to test df against df1: where the condition is True, the values from df are returned; where it is False, the values from df1 are returned. In addition, for positions where df1 holds NaN, an extra call to fillna(df) fills those NaN values with the values from df and returns the desired result:

    In [178]:
    df = pd.DataFrame(np.random.randn(5,3))
    df.iloc[1,2] = np.nan
    print(df)
    df1 = pd.DataFrame(np.random.randn(5,3))
    df1.iloc[0,0] = np.nan
    print(df1)

              0         1         2
    0  2.671118  1.412880  1.666041
    1 -0.281660  1.187589       NaN
    2 -0.067425  0.850808  1.461418
    3 -0.447670  0.307405  1.038676
    4 -0.130232 -0.171420  1.192321
              0         1         2
    0       NaN -0.244273 -1.963712
    1 -0.043011 -1.588891  0.784695
    2  1.094911  0.894044 -0.320710
    3 -1.537153  0.558547 -0.317115
    4 -1.713988 -0.736463 -1.030797

    In [179]:
    df.where(df > df1, df1).fillna(df)

    Out[179]:
              0         1         2
    0  2.671118  1.412880  1.666041
    1 -0.043011  1.187589  0.784695
    2  1.094911  0.894044  1.461418
    3 -0.447670  0.558547  1.038676
    4 -0.130232 -0.171420  1.192321
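For a reproducible check, the same where/fillna idiom can be wrapped in a small function and run on fixed data instead of random numbers. This is only a sketch: the helper name elementwise_max and the sample frames are made up for illustration, and the comparison against np.fmax (NumPy's NaN-ignoring maximum) is an added cross-check rather than part of the answer above. It assumes both frames share the same index, columns and float dtype.

```python
import numpy as np
import pandas as pd


def elementwise_max(a, b):
    """Element-wise maximum of two equally labelled DataFrames, ignoring NaN.

    Hypothetical helper name; it wraps the where/fillna idiom shown above.
    """
    # Keep a where a > b, otherwise take b. Any comparison with NaN is False,
    # so a NaN in either frame selects the value from b at that position.
    picked = a.where(a > b, b)
    # Positions that are still NaN had NaN in b; fall back to a there.
    return picked.fillna(a)


# Small illustrative frames (made-up values) covering each NaN combination.
df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 7.0, np.nan]})
df1 = pd.DataFrame({"x": [2.0, 4.0, np.nan], "y": [0.5, 6.0, np.nan]})

result = elementwise_max(df, df1)

# Cross-check against NumPy's NaN-ignoring maximum on the raw arrays.
expected = pd.DataFrame(
    np.fmax(df.to_numpy(), df1.to_numpy()), index=df.index, columns=df.columns
)

print(result)
print(result.equals(expected))
```

Where both frames hold NaN at the same position, the result stays NaN, since there is no number to fall back on.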

Source: https://habr.com/ru/post/1233250/

