I have the following data file:
| id | timestamp | code | id2
| 10 | 2017-07-12 13:37:00 | 206 | a1
| 10 | 2017-07-12 13:40:00 | 206 | a1
| 10 | 2017-07-12 13:55:00 | 206 | a1
| 10 | 2017-07-12 19:00:00 | 206 | a2
| 11 | 2017-07-12 13:37:00 | 206 | a1
...
I need to group by the columns id and id2 and get the first occurrence of timestamp in each group. For example, for id=10, id2=a1, the result should be timestamp=2017-07-12 13:37:00.
I searched for it and found some possible solutions, but I can’t figure out how to implement them correctly. It should probably be something like this:
df.groupby(["id", "id2"])["timestamp"].apply(lambda x: ....)
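For reference, here is a minimal sketch of what the call above might look like without a lambda. It assumes the file has already been loaded into a DataFrame named df with timestamp parsed as datetimes; the sample frame and the names first_in_file and earliest are illustrative only. "First occurrence" can mean either the first row in file order or the earliest timestamp, so both readings are shown.

```python
import pandas as pd

# Sample frame rebuilt from the rows shown above (assumed already parsed).
df = pd.DataFrame({
    "id": [10, 10, 10, 10, 11],
    "timestamp": pd.to_datetime([
        "2017-07-12 13:37:00",
        "2017-07-12 13:40:00",
        "2017-07-12 13:55:00",
        "2017-07-12 19:00:00",
        "2017-07-12 13:37:00",
    ]),
    "code": [206, 206, 206, 206, 206],
    "id2": ["a1", "a1", "a1", "a2", "a1"],
})

# Reading 1: first row of each (id, id2) group in file order.
first_in_file = df.groupby(["id", "id2"], as_index=False)["timestamp"].first()

# Reading 2: earliest timestamp of each group, regardless of row order.
earliest = df.groupby(["id", "id2"], as_index=False)["timestamp"].min()

print(first_in_file)
```

On the sample rows both readings give the same result, because the file is already sorted by timestamp within each group. If the real data is not sorted, min() is probably the safer choice.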