In pandas, how can I add a new column that numbers the rows within each group of a given grouping?
For example, take the following DataFrame:
    import pandas as pd
    import numpy as np

    a_list = ['A', 'B', 'C', 'A', 'A', 'C', 'B', 'B', 'A', 'C']
    df = pd.DataFrame({'col_a': a_list, 'col_b': range(10)})
    df

      col_a  col_b
    0     A      0
    1     B      1
    2     C      2
    3     A      3
    4     A      4
    5     C      5
    6     B      6
    7     B      7
    8     A      8
    9     C      9
I would like to add col_c, which gives each row's position (1st, 2nd, ..., N-th) within its group, where the groups are defined by col_a and the rows in each group are sorted by col_b.
Desired output:
      col_a  col_b  col_c
    0     A      0      1
    3     A      3      2
    4     A      4      3
    8     A      8      4
    1     B      1      1
    6     B      6      2
    7     B      7      3
    2     C      2      1
    5     C      5      2
    9     C      9      3
col_c is the part I am stuck on. I can already get the correct grouping and sorting with .sort_index(by=['col_a', 'col_b']); what remains is adding the new column that numbers each row within its group.
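For reference, here is a minimal sketch of one way the numbering could be produced. It is an assumption on my part rather than a confirmed solution: it uses `sort_values` for the sorting step (in newer pandas versions the `by` argument of `sort_index` is no longer available) and `groupby(...).cumcount()` for the per-group counter. The name `df_sorted` is only illustrative.

```python
import pandas as pd

a_list = ['A', 'B', 'C', 'A', 'A', 'C', 'B', 'B', 'A', 'C']
df = pd.DataFrame({'col_a': a_list, 'col_b': range(10)})

# Sort by group first, then by col_b within each group.
df_sorted = df.sort_values(['col_a', 'col_b'])

# cumcount() numbers the rows of each group 0, 1, 2, ... in their current
# order; adding 1 makes the numbering start at 1.
df_sorted['col_c'] = df_sorted.groupby('col_a').cumcount() + 1

print(df_sorted)
```

Because `cumcount()` counts rows in the order they appear, the sort has to happen before the grouping for col_c to follow col_b.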