This does the job efficiently: it uses argpartition, which finds the k largest values in each row in O(m) time (m being the number of columns), and then sorts only those k.
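import numpy as np
import pandas as pd

values = df.values
n, m = df.shape
k = min(4, m)  # cannot select more columns than the frame has
I, J = np.mgrid[:n, :m]
I = I[:, :1]  # column of row indices, broadcast against J below
# Unordered column indices of the k largest values in each row
if k < m:
    J = (-values).argpartition(k)[:, :k]
values = values[I, J]
names = df.columns.to_numpy()[J]
# Sort only the k selected values, in descending order
J2 = (-values).argsort()
names = names[I, J2]
values = values[I, J2]
# Interleave names (even columns) and values (odd columns)
names_and_values = np.empty((n, 2 * k), object)
names_and_values[:, 0::2] = names
names_and_values[:, 1::2] = values
result = pd.DataFrame(names_and_values)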
Example of the resulting layout, alternating column name and value (three pairs per row here):
0 1 2 3 4 5
0 A 7 C 5 B 2
1 B 3 A 3 C 1
2 C 6 B 2 D 1
3 D 9 B 6 A 3
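
For reuse, the same idea can be wrapped in a function. This is only a sketch: the name top_k_per_row and the small demo frame are made up for illustration and are not the data behind the output above. It assumes a purely numeric DataFrame, and it passes k - 1 to argpartition, which equally guarantees that the first k positions hold the k largest values.

```python
import numpy as np
import pandas as pd


def top_k_per_row(df, k):
    """Return (name, value) pairs of the k largest entries in each row.

    Hypothetical helper; assumes df holds only numeric columns.
    """
    values = df.to_numpy()
    n, m = values.shape
    k = min(k, m)  # cannot select more columns than exist
    rows = np.arange(n)[:, None]  # row indices, broadcast against cols

    if k < m:
        # Partial selection: the first k positions hold the k largest
        # values of each row, in no particular order.
        cols = np.argpartition(-values, k - 1, axis=1)[:, :k]
    else:
        cols = np.tile(np.arange(m), (n, 1))

    # Fully sort only the k selected values, descending.
    top = values[rows, cols]
    order = np.argsort(-top, axis=1)
    cols = cols[rows, order]
    top = top[rows, order]

    # Interleave names (even columns) and values (odd columns).
    out = np.empty((n, 2 * k), dtype=object)
    out[:, 0::2] = df.columns.to_numpy()[cols]
    out[:, 1::2] = top
    return pd.DataFrame(out, index=df.index)


# Made-up demo data, chosen without ties so the ordering is unambiguous.
demo = pd.DataFrame(
    [[7, 2, 5, 1], [3, 4, 1, 0], [0, 2, 6, 1]],
    columns=list("ABCD"),
)
result_demo = top_k_per_row(demo, 2)
print(result_demo)
```

When two values in a row are equal, their relative order in the result depends on the sort and is not guaranteed.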