I have a dataframe as follows:
import pandas as pd

d = {
    'id': [1, 2, 3, 4, 5],
    'is_overdue': [True, False, True, True, False],
    'org': ['A81001', 'A81002', 'A81001', 'A81002', 'A81003']
}
df = pd.DataFrame(data=d)
Now I want to work out, for each organization, what percentage of rows are overdue and what percentage are not.
I know how to group by organization and overdue status:
df.groupby(['org', 'is_overdue']).agg('count')
But how do I get each status's share within its organization? I want to get something like this:
org     is_overdue  not_overdue  proportion_overdue
A81001  2           0            100
A81002  1           1            50
A81003  0           1            0
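For reference, here is a minimal sketch of one way the table above might be produced. It assumes `is_overdue` is a clean boolean column with no missing values, so that its per-group sum is the overdue count and its per-group mean is the overdue fraction. The names `grouped` and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'is_overdue': [True, False, True, True, False],
    'org': ['A81001', 'A81002', 'A81001', 'A81002', 'A81003'],
})

# Group the boolean column by organization once and reuse the grouping.
grouped = df.groupby('org')['is_overdue']

result = pd.DataFrame({
    # True counts as 1, so the sum is the number of overdue rows.
    'is_overdue': grouped.sum().astype(int),
    # Rows in the group minus overdue rows gives the not-overdue count.
    'not_overdue': (grouped.count() - grouped.sum()).astype(int),
    # The mean of a boolean column is the fraction of True values.
    'proportion_overdue': grouped.mean() * 100,
})

print(result)
```

This is probably not the only option; `pd.crosstab(df['org'], df['is_overdue'])` would give the two counts directly, with the percentage computed from them afterwards.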