It seems you need to go through the DataFrame row by row.
    cols = df.columns
    bt = df.apply(lambda x: x > 0)
    bt.apply(lambda x: list(cols[x.values]), axis=1)
and you will get:
    0                                 [c1, c2]
    1                                     [c1]
    2                                     [c2]
    3                                     [c1]
    4                                     [c2]
    5                                       []
    6             [c2, c3, c4, c5, c6, c7, c9]
    7                 [c1, c2, c3, c6, c8, c9]
    8         [c1, c2, c4, c5, c6, c7, c8, c9]
    9     [c1, c2, c3, c4, c5, c6, c7, c8, c9]
    10                            [c1, c2, c4]
    11                [c1, c2, c3, c5, c7, c8]
    dtype: object
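To try this end to end, here is a minimal sketch. The sample DataFrame and its values are made up for illustration, since the original data is not shown; only the three statements after it follow the approach above.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not the asker's DataFrame).
df = pd.DataFrame({"c1": [1, 0, 2], "c2": [3, 0, 0], "c3": [0, 0, 5]})

cols = df.columns

# Boolean mask with the same shape as df: True where the value is positive.
bt = df.apply(lambda x: x > 0)

# For each row, keep the column labels whose mask entry is True.
result = bt.apply(lambda x: list(cols[x.values]), axis=1)

print(result.tolist())  # [['c1', 'c2'], [], ['c1', 'c3']]
```

A row with no positive values simply yields an empty list, as in row 5 of the output above.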
If performance matters, try passing raw=True when creating the boolean DataFrame, as shown below:
    %timeit df.apply(lambda x: x > 0, raw=True).apply(lambda x: list(cols[x.values]), axis=1)
    1000 loops, best of 3: 812 µs per loop
In this measurement it is roughly three times faster. For comparison, here is the result with raw=False (the default):
    %timeit df.apply(lambda x: x > 0).apply(lambda x: list(cols[x.values]), axis=1)
    100 loops, best of 3: 2.59 ms per loop
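If you are not in IPython, a rough way to repeat the comparison is the standard timeit module. This is only a sketch: the random DataFrame and the helper names without_raw and with_raw are assumptions for illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

# Made-up data: 12 rows, 9 integer columns named c1..c9 (an assumption,
# chosen only to resemble the shape of the output shown earlier).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(-5, 5, size=(12, 9)),
    columns=[f"c{i}" for i in range(1, 10)],
)
cols = df.columns


def without_raw():
    # Default raw=False: the lambda receives each column as a Series.
    return df.apply(lambda x: x > 0).apply(lambda x: list(cols[x.values]), axis=1)


def with_raw():
    # raw=True: the lambda receives each column as a plain NumPy array.
    return df.apply(lambda x: x > 0, raw=True).apply(
        lambda x: list(cols[x.values]), axis=1
    )


# Both variants should produce the same lists of column labels.
same_result = without_raw().tolist() == with_raw().tolist()

# Average seconds per call over 100 runs.
t_default = timeit.timeit(without_raw, number=100) / 100
t_raw = timeit.timeit(with_raw, number=100) / 100

print(f"same result: {same_result}")
print(f"raw=False: {t_default * 1e3:.3f} ms per call")
print(f"raw=True:  {t_raw * 1e3:.3f} ms per call")
```

Checking that both variants return the same lists before timing them guards against comparing two calls that do different work.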