- 1.
df['column1']=myFunction(df['column1'])
Here the function is applied to the whole pd.Series at once, and pandas decides how the computation is carried out.
- 2.
df['column1']=df['column1'].apply(lambda x: myFunction(x))
Here the function is called separately for each element.
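Option 1 is generally faster than option 2, although this depends on how myFunction is written.
Example:
Take a DataFrame with 100 000 rows and square column1: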
In [1]:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(100000,2),
                  columns=['column1','column2'])
def myFunction(s):
    return s**2
In [2]: %%timeit
...: myFunction(df.column1)
...:
1000 loops, best of 3: 1.68 ms per loop
In [3]: %%timeit
...: df.column1.apply(lambda x: x**2)
...:
10 loops, best of 3: 55.4 ms per loop
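As you can see, it is roughly 30 times faster when the function is applied to the pd.Series as a whole rather than element by element. Of course, this depends on myFunction.
For example, if myFunction itself works element by element, the advantage disappears: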
In [4]: def myFunction(s):
...:     return s.apply(lambda x: x**2)
...:
In [4]: %%timeit
...: myFunction(df.column1)
...:
10 loops, best of 3: 53.9 ms per loop
That is, the timing is about the same as when calling apply directly.
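Outside IPython, the same comparison can be sketched as a plain script using the standard timeit module. This is only an illustration: the names square_vectorized and square_elementwise are hypothetical stand-ins for the two versions of myFunction, and the absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100000, 2),
                  columns=['column1', 'column2'])

def square_vectorized(s):
    # Hypothetical name: operates on the whole Series in one call
    return s ** 2

def square_elementwise(s):
    # Hypothetical name: calls the lambda once per element
    return s.apply(lambda x: x ** 2)

# Both versions should produce the same values
same = np.allclose(square_vectorized(df['column1']),
                   square_elementwise(df['column1']))

# Total time for 5 calls of each version
t_vec = timeit.timeit(lambda: square_vectorized(df['column1']), number=5)
t_elem = timeit.timeit(lambda: square_elementwise(df['column1']), number=5)

print('same result:', same)
print('vectorized:  %.4f s for 5 calls' % t_vec)
print('elementwise: %.4f s for 5 calls' % t_elem)
```

The equality check is there to confirm that only the speed differs between the two versions, not the result.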