Alternative to mutate (dplyr package) in Python pandas

Is there a function similar to mutate (dplyr) with which I can add a new column for grouped data by applying a function to one of the columns of grouped data? The following is a detailed explanation of the problem:

I generated sample data using the following code:

x<- data.frame(country = rep(c("US", "UK"), 5), state = c(letters[1:10]), pop=sample(10000:50000,10))

Now I want to add a new column holding the maximum population for each country (US and UK). In R, I can do this as follows:

x<- group_by(x, country)
x<- mutate(x,max_pop = max(pop))
x<- arrange(x, country)

So my question is: how can I do this in Python using pandas? I tried the following, but it did not work:

x['max_pop'] = x.groupby('country').pop.apply(max)
1 answer

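Use transform. transform returns an object with the same index as the data being grouped, so the result can be assigned straight back as a new column.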

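import pandas as pd

# Sample data with the same columns as in the question
x = pd.DataFrame(dict(
    country=['US','UK','US','UK'],
    state=['a','b','c','d'],
    pop=[37088, 46987, 17116, 20484]
))

# Broadcast each country's maximum to every row of that country.
# ['pop'] is used instead of .pop because DataFrame already has a pop() method.
x['max_pop'] = x.groupby('country')['pop'].transform('max')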
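
To see why the attempt in the question fails, it may help to compare the shape of an aggregated result with that of a transformed one. The sketch below reuses the sample data from this answer. The variable names per_group, per_row and result are only illustrative, and .max() stands in for the question's apply(max). The final step is one possible equivalent of arrange, assuming that ties should keep their original order.

```python
import pandas as pd

x = pd.DataFrame({
    "country": ["US", "UK", "US", "UK"],
    "state": ["a", "b", "c", "d"],
    "pop": [37088, 46987, 17116, 20484],
})

# Aggregation: one value per group, indexed by country. Assigning this
# to x would not align with x's integer row index.
per_group = x.groupby("country")["pop"].max()

# transform: one value per original row, indexed like x.
per_row = x.groupby("country")["pop"].transform("max")

# Rough equivalent of group_by + mutate + arrange from the R snippet.
# mergesort is a stable sort, so rows keep their order within a country.
result = (
    x.assign(max_pop=per_row)
     .sort_values("country", kind="mergesort")
     .reset_index(drop=True)
)
print(result)
```

Because per_group is indexed by 'UK' and 'US' rather than by row number, assigning it to x['max_pop'] finds no matching labels and fills the column with NaN; per_row shares its index with x, so the assignment lines up row by row.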

Source: https://habr.com/ru/post/1663765/

