I have a pandas data frame of vehicle coordinates (from several vehicles over several days). For each vehicle and each day I do one of two things: either apply an algorithm to it, or filter it out of the data set completely if it does not meet certain criteria.
To do this, I use df.groupby(['vehicle_id', 'day']) and then either .apply(algorithm) or .filter(condition), where algorithm and condition are functions that take a data frame.
I would like the full processing of my data set (which involves several .apply and .filter steps) to be written in a declarative style, as opposed to imperatively looping through the groups, with the goal of the whole thing looking something like:
df.groupby(['vehicle_id', 'day']).apply(algorithm1).filter(condition1).apply(algorithm2).filter(condition2)
Of course, the code above is incorrect, because .apply() and .filter() return new data frames, and that is exactly my problem. They return all the data back in a single data frame, so I find myself calling .groupby(['vehicle_id', 'day']) over and over.
Is there a good way to write this without having to group by the same columns again and again?
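To make the repetition concrete, here is a minimal sketch of roughly what I end up writing. The toy data, the x column, and the bodies of algorithm1, algorithm2, condition1 and condition2 are placeholders invented for illustration, not my real processing. Selecting the value columns explicitly and restoring the keys with reset_index is just one way I assume keeps the grouping columns available after .apply on recent pandas versions.

```python
import pandas as pd

KEYS = ["vehicle_id", "day"]   # grouping columns from the question
VALUES = ["x"]                 # hypothetical value column(s)

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "vehicle_id": [1, 1, 1, 2, 2, 3],
    "day": ["mon", "mon", "tue", "mon", "mon", "mon"],
    "x": [0.0, 1.0, 2.0, 5.0, 7.0, 9.0],
})

def algorithm1(group):
    # Hypothetical: shift x so that each group starts at zero.
    return group.assign(x=group["x"] - group["x"].iloc[0])

def condition1(group):
    # Hypothetical: keep only groups with at least two rows.
    return len(group) >= 2

def algorithm2(group):
    # Hypothetical: running total of x within the group.
    return group.assign(x=group["x"].cumsum())

def condition2(group):
    # Hypothetical: keep only groups whose x reaches at least 2.
    return group["x"].max() >= 2

# Every step hands back one flat data frame, so the same grouping
# has to be spelled out again before the next step.
result = (
    df
    .groupby(KEYS, group_keys=True)[VALUES].apply(algorithm1)
    .reset_index(level=KEYS)   # move the group keys back into columns
    .groupby(KEYS).filter(condition1)
    .groupby(KEYS, group_keys=True)[VALUES].apply(algorithm2)
    .reset_index(level=KEYS)
    .groupby(KEYS).filter(condition2)
)

print(result)
```

The groupby on KEYS appears four times here, once per step, which is the duplication I would like to get rid of.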