One of the rules Pandas Zen says always try to find a vectorized solution first.
.apply(..., axis=1) not vectorized!
Consider the alternatives:
In [164]: df.prod(axis=1)
Out[164]:
0 0.770675
1 0.539782
2 0.318027
3 0.597172
4 0.211643
dtype: float64
In [165]: df[0] * df[1]
Out[165]:
0 0.770675
1 0.539782
2 0.318027
3 0.597172
4 0.211643
dtype: float64
Timing versus 50,000 lines of DF:
In [166]: df = pd.concat([df] * 10**4, ignore_index=True)
In [167]: df.shape
Out[167]: (50000, 2)
In [168]: %timeit df.apply(multiply, axis=1)
1 loop, best of 3: 6.12 s per loop
In [169]: %timeit df.prod(axis=1)
100 loops, best of 3: 6.23 ms per loop
In [170]: def multiply_vect(x1, x2):
...: return x1*x2
...:
In [171]: %timeit multiply_vect(df[0], df[1])
1000 loops, best of 3: 604 µs per loop
Conclusion: use .apply()as a last resort (i.e. when nothing helps)
source
share