I have a pandas dataframe as follows:
A B C
1 2 x
1 2 y
3 4 z
3 5 x
I want to keep only one row from each group of rows that share the same values in certain columns, here columns A and B. In other words, if a combination of values in A and B occurs more than once in the dataframe, only one of those rows should remain (it does not matter which one).
FWIW: The maximum number of such duplicate rows (that is, rows with the same values in both A and B) is 2.
The result should look like this:
A B C
1 2 x
3 4 z
3 5 x
or
A B C
1 2 y
3 4 z
3 5 x
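For reference, a minimal sketch of one way this might be done, assuming the data is in a dataframe named `df` (a name chosen here for illustration) and that pandas' `drop_duplicates` with a `subset` of columns fits the requirement:

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "A": [1, 1, 3, 3],
    "B": [2, 2, 4, 5],
    "C": ["x", "y", "z", "x"],
})

# Keep one row per distinct (A, B) pair.
# keep="first" retains the earliest row of each duplicate group.
result = df.drop_duplicates(subset=["A", "B"], keep="first").reset_index(drop=True)
print(result)
```

With `keep="first"` this should give the first of the two results shown above; `keep="last"` should give the second (the row with `y` instead of `x`).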