I have a DataFrame containing weekly sales for different products (a, b, c). If a product had zero sales in a given week (for example, week 4), there is no record for that week:
In[1]
import numpy as np
import pandas as pd

df = pd.DataFrame({'product': list('aaaabbbbcccc'),
                   'week': [1, 2, 3, 5, 1, 2, 3, 5, 1, 2, 3, 4],
                   'sales': np.power(2, range(12))})
Out[1]
   product  sales  week
0        a      1     1
1        a      2     2
2        a      4     3
3        a      8     5
4        b     16     1
5        b     32     2
6        b     64     3
7        b    128     5
8        c    256     1
9        c    512     2
10       c   1024     3
11       c   2048     4
I would like to create a new column containing the total sales over the previous n weeks, grouped by product. For example, for n = 2 the new column last_2_weeks should look like this:
   product  sales  week  last_2_weeks
0        a      1     1             0
1        a      2     2             1
2        a      4     3             3
3        a      8     5             4
4        b     16     1             0
5        b     32     2            16
6        b     64     3            48
7        b    128     5            64
8        c    256     1             0
9        c    512     2           256
10       c   1024     3           768
11       c   2048     4          1536
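To pin down the definition, here is a slow but direct sketch of what last_2_weeks means: for each row, add up the sales of the same product in weeks week - n through week - 1. The helper name previous_weeks_total is made up for illustration, and a row-wise apply like this is only meant as a reference, not as something that would scale.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'product': list('aaaabbbbcccc'),
                   'week': [1, 2, 3, 5, 1, 2, 3, 5, 1, 2, 3, 4],
                   'sales': np.power(2, range(12))})

n = 2  # window length in weeks


def previous_weeks_total(row):
    # Hypothetical helper: select rows of the same product whose week
    # falls in [week - n, week - 1]; weeks without a record add nothing.
    mask = ((df['product'] == row['product'])
            & (df['week'] >= row['week'] - n)
            & (df['week'] < row['week']))
    return df.loc[mask, 'sales'].sum()


df['last_2_weeks'] = df.apply(previous_weeks_total, axis=1)
print(df)
```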
If there were a record for every week, I could just use rolling_sum as described in this question.
Is there a way to set week as the index and compute the sum over that index only? Or could I resample week and set sales to zero for all the missing rows?
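A rough sketch of the second idea, under a few assumptions: a reasonably recent pandas where Series.rolling(...).sum() takes the place of rolling_sum, integer week numbers, and at most one row per (product, week) pair. The names all_weeks, full_index, filled and prev are made up for illustration. The idea is to reindex onto every (product, week) combination with zero sales for the gaps, take a rolling sum per product, and shift it by one week so that the current week is excluded.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'product': list('aaaabbbbcccc'),
                   'week': [1, 2, 3, 5, 1, 2, 3, 5, 1, 2, 3, 4],
                   'sales': np.power(2, range(12))})

n = 2  # window length in weeks

# Every week between the first and last observed week, for every product.
all_weeks = range(df['week'].min(), df['week'].max() + 1)
full_index = pd.MultiIndex.from_product(
    [df['product'].unique(), all_weeks], names=['product', 'week'])

# Sales indexed by (product, week); weeks without a record get 0.
filled = (df.set_index(['product', 'week'])['sales']
            .reindex(full_index, fill_value=0))

# Per product: sum over n consecutive weeks, then shift by one week so
# each week sees only the n weeks before it.
prev = (filled.groupby(level='product')
              .transform(lambda s: s.rolling(n, min_periods=1).sum().shift(1))
              .fillna(0)
              .astype(int))

# Attach the result to the original rows only (left join on product, week).
df = df.join(prev.rename('last_2_weeks'), on=['product', 'week'])
print(df)
```

On the sample data this produces the last_2_weeks column shown above. Whether it is reasonable for a large frame depends on how sparse the weeks are, since the reindexed series has one entry per product per week.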