I have huge datasets, each with over a million rows and a few specific attributes. I need to filter out some rows while keeping all the other rows unchanged.
My data is as follows:
ID Prop1 Prop2 TotalProp
56891940 G02 G02 2
56892558 A61 G02 4
56892558 A61 A61 4
56892558 G02 A61 4
56892558 A61 A61 4
56892552 B61 B61 3
56892552 B61 B61 3
56892552 B61 A61 3
56892559 B61 G61 3
56892559 B61 B61 3
56892559 B61 B61 3
... and so on, for more than a million rows.
What I need: delete every row of an ID when all of that ID's rows have the same "Prop1" and "Prop2" (for example 56891940). IDs such as 56892558, 56892552 and 56892559 should not be touched: some of their rows have matching properties, but at least one row has different properties, so I want to keep all of their rows.
My end result should look like this:
ID Prop1 Prop2 TotalProp
56892558 A61 G02 4
56892558 A61 A61 4
56892558 G02 A61 4
56892558 A61 A61 4
56892552 B61 B61 3
56892552 B61 B61 3
56892552 B61 A61 3
56892559 B61 G61 3
56892559 B61 B61 3
56892559 B61 B61 3
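
To make the rule concrete, here is a minimal sketch in plain Python of the filter I am after. It assumes the data is whitespace-separated as shown above; the function name `filter_mixed_ids` is made up for illustration. It is only meant to pin down the logic, not to be the fastest approach for a million rows.

```python
def filter_mixed_ids(rows):
    """Keep every row whose ID has at least one row with Prop1 != Prop2."""
    rows = list(rows)
    # First pass: collect the IDs that have at least one differing row.
    keep_ids = {row[0] for row in rows if row[1] != row[2]}
    # Second pass: keep all rows of those IDs, in their original order.
    return [row for row in rows if row[0] in keep_ids]


# Sample input copied from the data above (header line left out).
sample = """\
56891940 G02 G02 2
56892558 A61 G02 4
56892558 A61 A61 4
56892558 G02 A61 4
56892558 A61 A61 4
56892552 B61 B61 3
56892552 B61 B61 3
56892552 B61 A61 3
56892559 B61 G61 3
56892559 B61 B61 3
56892559 B61 B61 3
"""

rows = [line.split() for line in sample.splitlines()]
result = filter_mixed_ids(rows)
for row in result:
    print(" ".join(row))
```

On this sample only the single 56891940 row is dropped; every row of the other three IDs is kept, including the rows where both properties match.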