When I drop John as a duplicate specifying "name" as the column name:
    import pandas as pd

    data = {'name': ['Bill', 'Steve', 'John', 'John', 'John'],
            'age': [21, 28, 22, 30, 29]}
    df = pd.DataFrame(data)
    df = df.drop_duplicates('name')
pandas removes all the matching rows, keeping only the first occurrence:
       age   name
    0   21   Bill
    1   28  Steve
    2   22   John
Instead, I would like to keep the row where John has the highest age (in this example, age 30). How can I achieve this?
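One possible approach, sketched here under the assumption that reordering the rows temporarily is acceptable: since `drop_duplicates` keeps the first occurrence by default, sorting by `age` in descending order beforehand should leave the oldest row for each name. The variable `result` and the final `sort_index()` call (to restore the original row order) are my own additions for illustration.

```python
import pandas as pd

data = {'name': ['Bill', 'Steve', 'John', 'John', 'John'],
        'age': [21, 28, 22, 30, 29]}
df = pd.DataFrame(data)

# Put the highest age first, so the default keep='first' retains that row
# for each name; then restore the original row order by index.
result = (df.sort_values('age', ascending=False)
            .drop_duplicates('name')
            .sort_index())
print(result)
```

With the sample data this should keep Bill (21), Steve (28), and the John row with age 30. If sorting is undesirable, selecting rows with `df.loc[df.groupby('name')['age'].idxmax()]` may be an alternative worth trying.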