Some simple data to get started:
import pandas as pd
import numpy as np

df = pd.DataFrame({"x": np.random.normal(size=100), "y": np.random.normal(size=100)})
Up to this point, I had always thought that assign was the pandas equivalent of mutate from the dplyr library. However, if I try to use a variable created in an assign call within that same call, I get an error. Consider the following, which is valid in R:
 df %>% mutate(z = x * y, w = z + 10) 
If I try the equivalent in pandas, I get an error:
 df.assign(z = df.x * df.y, w = z + 10)  
The only way I have found to do this is to chain two assign calls:
 df.assign(z = df.x * df.y).assign(w = lambda d: d.z + 10) 
Is there something I missed? Or is there another function that is more suitable?
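For reference, here is a minimal sketch of one possible approach, assuming pandas 0.23 or later on Python 3.6+. In those versions, keyword arguments to assign are evaluated in order, so a callable passed later in the same call can see columns created earlier in that call. The names chained and single are only illustrative. A bare z still fails, because Python looks up the name z before assign is ever called, so the later column has to be written as a lambda.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the example is reproducible
df = pd.DataFrame({"x": np.random.normal(size=100), "y": np.random.normal(size=100)})

# Two chained assign calls, as in the workaround above.
chained = df.assign(z=df.x * df.y).assign(w=lambda d: d.z + 10)

# Single assign call: each lambda receives the frame as built so far,
# so the second lambda can read the z column created by the first.
# Assumes pandas >= 0.23 and Python >= 3.6 (ordered keyword arguments).
single = df.assign(
    z=lambda d: d.x * d.y,
    w=lambda d: d.z + 10,
)

# Both approaches should produce the same columns and values.
same = chained.equals(single)
```

On older pandas versions the order of keyword arguments was not guaranteed, so the chained form is the safer choice there.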