Pandas dataframe: select max by column for a subset

I am new to pandas and keep going around in circles trying to find an easy way to solve the following problem:

I have a large correlation matrix (several thousand rows / columns) as a data frame, and I would like to extract the maximum value of each column, excluding the "1" that is, of course, present in every column (the matrix diagonal).

I have tried all kinds of combinations of .max() and .idxmax(), including the following:

corr.drop(corr.idxmax()).max()

But I only get meaningless results. Any help is appreciated.

2 answers

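You could use np.fill_diagonal to overwrite the diagonal before taking the column-wise maximum:

import numpy as np

df_values=df.values.copy()  # work on a copy so df itself is not modified
np.fill_diagonal(df_values,-np.inf)  # the diagonal can no longer be the maximum
df_values.max(0)  # maximum of each column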

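Or as a one-liner. It selects the off-diagonal entries of the transposed matrix, so that after reshaping each row holds one column of df without its diagonal element, and then takes the maximum of each row:

df.values.T[~np.eye(df.shape[0],dtype=bool)].reshape(df.shape[0],-1).max(1)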
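For comparison, here is a minimal pandas-only sketch of the same idea. It is not part of the answer above: the small frame `corr` and its labels are made up for illustration. It hides the diagonal with `DataFrame.mask` and relies on `max` and `idxmax` skipping NaN by default, which also keeps the column labels and shows which row each maximum comes from.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 correlation matrix, for illustration only.
corr = pd.DataFrame(
    [[1.0, 0.2, 0.7],
     [0.2, 1.0, 0.5],
     [0.7, 0.5, 1.0]],
    index=list("abc"), columns=list("abc"),
)

diag = np.eye(len(corr), dtype=bool)  # True on the diagonal only
off_diag = corr.mask(diag)            # diagonal entries become NaN

col_max = off_diag.max()              # NaN is skipped, so the 1s are ignored
partner = off_diag.idxmax()           # row label where each column peaks
```

On a matrix with several thousand rows and columns this allocates a full masked copy, so the NumPy versions may be preferable when memory is tight.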

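Alternatively, take the 2nd largest value in each column, since the 1 on the diagonal is always the largest.

Using np.partition: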

np.partition(df.values, len(df)-2, axis=0)[len(df)-2]

As a data frame:

pd.DataFrame(np.partition(df.values, len(df)-2, axis=0)[len(df)-2],
             index=df.columns, columns=['2nd'])
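A small worked sketch of the partition approach, cross-checked against the fill_diagonal approach. The frame `corr` is made-up sample data, and the sketch assumes the matrix contains no NaN and that the diagonal 1 really is the largest value in every column.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 correlation matrix, for illustration only.
corr = pd.DataFrame(
    [[1.0, 0.2, 0.7],
     [0.2, 1.0, 0.5],
     [0.7, 0.5, 1.0]],
    index=list("abc"), columns=list("abc"),
)

n = len(corr)
# Partition each column so that position n-2 holds the value it would have
# after a full sort, i.e. the second-largest value of that column.
second = np.partition(corr.values, n - 2, axis=0)[n - 2]
result = pd.DataFrame(second, index=corr.columns, columns=["2nd"])

# Cross-check: blank out the diagonal on a copy and take the column maximum.
check = corr.values.copy()
np.fill_diagonal(check, -np.inf)
same = np.array_equal(second, check.max(axis=0))
```

np.partition returns a partitioned copy, so `corr` is left unchanged, and it avoids sorting every column completely.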

Source: https://habr.com/ru/post/1693151/

