I have the following data frame:
import pandas as pd

df = pd.DataFrame([[1, 11, 'a'], [2, 12, 'b'], [1, 11, 'c'],
                   [3, 12, 'd'], [3, 7, 'e'], [2, 12, 'f']])
df.columns = ['id', 'code', 'name']
print(df)
   id  code name
0   1    11    a
1   2    12    b
2   1    11    c
3   3    12    d
4   3     7    e
5   2    12    f
For the above data frame, I want only one value of the column name for each unique combination of the columns id and code. For example, the value of name for rows 0 and 2 should be the same. Likewise, name for rows 1 and 5 should be the same. The expected result is:
   id  code name
0   1    11    a
1   2    12    b
2   1    11    a
3   3    12    d
4   3     7    e
5   2    12    b
Please let me know how this can be done programmatically. I have to do this operation on over 100,000 rows.
Thanks.
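One possible approach, offered as a minimal sketch rather than a definitive answer: group by id and code and broadcast the first name of each group back onto every row with groupby(...).transform('first'). This assumes the first name seen for each (id, code) pair is the one to keep, which matches the expected output in the question. The operation is vectorized, so it should scale reasonably to 100,000+ rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[1, 11, 'a'], [2, 12, 'b'], [1, 11, 'c'],
     [3, 12, 'd'], [3, 7, 'e'], [2, 12, 'f']],
    columns=['id', 'code', 'name'],
)

# Assumption: the first name encountered in each (id, code) group wins.
# transform('first') returns a Series aligned to the original index,
# so it can be assigned straight back to the column.
df['name'] = df.groupby(['id', 'code'])['name'].transform('first')

print(df)
```

If a different rule should decide which name is kept (for example the last one), 'first' can be swapped for 'last' in the same call.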