As I mentioned earlier, this will give you a non-cumulative difference between the dates within each group:
df['days_since_last_event'] = df.groupby('group_ids')['dates'].diff().dt.days
To get a cumulative sum of this difference that depends on when event_today_in_group changes, I suggest using shift() to take the previous row's value and then generating a cumulative sum, for example:
df['event_today_in_group'].shift().cumsum()
Output:
0 NaN
1 1.0
2 1.0
3 2.0
4 3.0
5 4.0
Then use this series as a second groupby key, alongside group_ids, like this:
df.loc[:, 'days_since_last_event'] = df.groupby(['group_ids', df['event_today_in_group'].shift().cumsum()])['days_since_last_event'].cumsum()
Output:
   group_ids      dates  event_today_in_group  days_since_last_event
0          1 2016-04-01                     1                    NaN
1          1 2016-04-20                     0                   19.0
2          1 2016-04-28                     1                   27.0
3          2 2016-04-05                     1                    NaN
4          2 2016-04-20                     1                   15.0
5          2 2016-04-29                     0                    9.0
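
If you want to try the whole thing end to end, here is a minimal self-contained sketch. The sample frame is reconstructed from the output above, the intermediate name block is my own, and it assumes the dates column is already a datetime dtype (hence the pd.to_datetime call).

```python
import pandas as pd

# Sample data reconstructed from the output shown above (an assumption).
df = pd.DataFrame({
    "group_ids": [1, 1, 1, 2, 2, 2],
    "dates": pd.to_datetime([
        "2016-04-01", "2016-04-20", "2016-04-28",
        "2016-04-05", "2016-04-20", "2016-04-29",
    ]),
    "event_today_in_group": [1, 0, 1, 1, 1, 0],
})

# Step 1: non-cumulative gap in days between consecutive rows of each group.
df["days_since_last_event"] = df.groupby("group_ids")["dates"].diff().dt.days

# Step 2: block label that increases on the row after each flagged event.
# "block" is a hypothetical helper name, not part of the original answer.
block = df["event_today_in_group"].shift().cumsum()

# Step 3: accumulate the gaps within each (group, block) pair.
df["days_since_last_event"] = (
    df.groupby(["group_ids", block])["days_since_last_event"].cumsum()
)

print(df)
```

Keeping the shifted cumulative sum in its own variable makes it easy to print and check which rows end up in the same block before you group on it.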