I have this DataFrame:
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'fuz', 'baz', 'fuz', 'coo'],
                   'B': ['one', 'one', 'two', 'two',
                         'three', 'three', 'four', 'one']})
It looks like this:
     A      B
0  foo    one
1  bar    one
2  foo    two
3  bar    two
4  fuz  three
5  baz  three
6  fuz   four
7  coo    one
I would like to create a new column, group. A group ties together rows that are linked through shared values in columns A and B.
It starts from the unique values in one column, then looks at the values in the other column for rows already in the group, and repeats until nothing new is added.
The result will look like this:
     A      B  group
0  foo    one      1
1  bar    one      1
2  foo    two      1
3  bar    two      1
4  fuz  three      2
5  baz  three      2
6  fuz   four      2
7  coo    one      1
In this example, we start with foo in column A. Every foo row will be in group 1. The related values in B, one and two, are therefore also in group 1.
The values in column A that correspond to one and two are foo, bar and coo, so they are also in group 1.
The same principle gives us group 2.
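To make the rule concrete, here is a minimal sketch of how I understand it, written in plain Python. It treats every value as a node and every row as a link between its A value and its B value, then merges linked nodes with a small union-find. The function name assign_groups and its helpers are my own placeholders, not pandas functions, and this is only an illustration of the principle, not necessarily the best approach.

```python
def assign_groups(a_values, b_values):
    """Label rows so that rows linked through a shared A or B value share a group."""
    parent = {}

    def find(node):
        # Register unseen nodes, then walk up to the root (with path halving).
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(x, y):
        # Merge the two sets that contain x and y.
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    rows = list(zip(a_values, b_values))

    # Tag each value with its column so that equal strings in A and B stay distinct.
    for a, b in rows:
        union(('A', a), ('B', b))

    # Number the groups 1, 2, ... in order of first appearance.
    labels = {}
    groups = []
    for a, _ in rows:
        root = find(('A', a))
        groups.append(labels.setdefault(root, len(labels) + 1))
    return groups


A = ['foo', 'bar', 'foo', 'bar', 'fuz', 'baz', 'fuz', 'coo']
B = ['one', 'one', 'two', 'two', 'three', 'three', 'four', 'one']
groups = assign_groups(A, B)
print(groups)  # [1, 1, 1, 1, 2, 2, 2, 1]
```

With the DataFrame above, the column could presumably be filled with df['group'] = assign_groups(df['A'], df['B']).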
What would be the best way to do this?