Calculation of daily units on pandas with a custom function

I am new to Pandas, and as an exercise, I am moving old code / solutions to learn from it. In this case, I am trying to calculate a synthetic index for the prices of a product that was previously calculated in SQL.

This is the data that I have in the data area:

    id  weight      date        price
0   1   0.002796    2005-11-15  0.998298
1   1   0.002796    2005-11-16  1.014242
2   1   0.002796    2005-11-17  1.016452
3   1   0.002796    2005-11-18  1.026396
4   1   0.002796    2005-11-19  1.026047
5   1   0.002796    2005-11-20  1.024285
6   1   0.002796    2005-11-21  1.018764
7   1   0.002796    2005-11-22  1.033175
8   1   0.002796    2005-11-23  1.058509
9   1   0.002796    2005-11-24  1.061231
10  1   0.002796    2005-11-25  1.058137
11  1   0.002796    2005-11-26  0.999380
12  1   0.002796    2005-11-27  0.990504
13  1   0.002796    2005-11-28  0.993764
14  1   0.002796    2005-11-29  0.978754
15  1   0.002796    2005-11-30  0.992070
...     ...     ...     ...     ...
4085    1   0.002796    2017-01-21  0.857420
4086    1   0.002796    2017-01-22  0.848195
4087    1   0.002796    2017-01-23  0.791784
4088    1   0.002796    2017-01-24  0.846603
4089    1   0.002796    2017-01-25  0.878104
4090    1   0.002796    2017-01-26  0.806651
4091    1   0.002796    2017-01-27  0.849316
4092    1   0.002796    2017-01-28  0.826550
4093    1   0.002796    2017-01-29  0.848651
4094    1   0.002796    2017-01-30  0.829643
4095    1   0.002796    2017-01-31  0.837094
4096    1   0.002796    2017-02-01  0.846572
4097    1   0.002796    2017-02-02  0.800163
4098    1   0.002796    2017-02-03  0.820356
4099    1   0.002796    2017-02-04  0.818924
4100    1   0.002796    2017-02-05  0.822157
4101    1   0.002796    2017-02-06  0.787123
4102    1   0.002796    2017-02-07  0.796264
4103    1   0.002796    2017-02-08  0.797241
4104    1   0.002796    2017-02-09  0.818499
4105    1   0.002796    2017-02-10  0.810928

The synthetic index is calculated with daily returns that are the return of the day:

Rt = (Price_day / Price_day_before) - 1

I read about Pandas, time series, etc., but I'm struggling to figure out the specific operation to perform here; is it rolling? How to get data for a given date and date before?

+4
source share
3 answers

IIUC pct_change():

In [196]: x
Out[196]:
    id    weight        date     price
0    1  0.002796  2005-11-15  0.998298
1    1  0.002796  2005-11-16  1.014242
2    1  0.002796  2005-11-17  1.016452
3    1  0.002796  2005-11-18  1.026396
4    1  0.002796  2005-11-19  1.026047
5    1  0.002796  2005-11-20  1.024285
6    1  0.002796  2005-11-21  1.018764
7    1  0.002796  2005-11-22  1.033175
8    1  0.002796  2005-11-23  1.058509
9    1  0.002796  2005-11-24  1.061231
10   1  0.002796  2005-11-25  1.058137
11   1  0.002796  2005-11-26  0.999380
12   1  0.002796  2005-11-27  0.990504
13   1  0.002796  2005-11-28  0.993764
14   1  0.002796  2005-11-29  0.978754
15   1  0.002796  2005-11-30  0.992070

In [197]: x['Rt'] = x['price'].pct_change()

In [198]: x
Out[198]:
    id    weight        date     price        Rt
0    1  0.002796  2005-11-15  0.998298       NaN
1    1  0.002796  2005-11-16  1.014242  0.015971
2    1  0.002796  2005-11-17  1.016452  0.002179
3    1  0.002796  2005-11-18  1.026396  0.009783
4    1  0.002796  2005-11-19  1.026047 -0.000340
5    1  0.002796  2005-11-20  1.024285 -0.001717
6    1  0.002796  2005-11-21  1.018764 -0.005390
7    1  0.002796  2005-11-22  1.033175  0.014146
8    1  0.002796  2005-11-23  1.058509  0.024521
9    1  0.002796  2005-11-24  1.061231  0.002572
10   1  0.002796  2005-11-25  1.058137 -0.002915
11   1  0.002796  2005-11-26  0.999380 -0.055529
12   1  0.002796  2005-11-27  0.990504 -0.008882
13   1  0.002796  2005-11-28  0.993764  0.003291
14   1  0.002796  2005-11-29  0.978754 -0.015104
15   1  0.002796  2005-11-30  0.992070  0.013605

( shift()):

In [199]: x['Rt2'] = x['price'] / x['price'].shift() - 1

In [200]: x
Out[200]:
    id    weight        date     price        Rt       Rt2
0    1  0.002796  2005-11-15  0.998298       NaN       NaN
1    1  0.002796  2005-11-16  1.014242  0.015971  0.015971
2    1  0.002796  2005-11-17  1.016452  0.002179  0.002179
3    1  0.002796  2005-11-18  1.026396  0.009783  0.009783
4    1  0.002796  2005-11-19  1.026047 -0.000340 -0.000340
5    1  0.002796  2005-11-20  1.024285 -0.001717 -0.001717
6    1  0.002796  2005-11-21  1.018764 -0.005390 -0.005390
7    1  0.002796  2005-11-22  1.033175  0.014146  0.014146
8    1  0.002796  2005-11-23  1.058509  0.024521  0.024521
9    1  0.002796  2005-11-24  1.061231  0.002572  0.002572
10   1  0.002796  2005-11-25  1.058137 -0.002915 -0.002915
11   1  0.002796  2005-11-26  0.999380 -0.055529 -0.055529
12   1  0.002796  2005-11-27  0.990504 -0.008882 -0.008882
13   1  0.002796  2005-11-28  0.993764  0.003291  0.003291
14   1  0.002796  2005-11-29  0.978754 -0.015104 -0.015104
15   1  0.002796  2005-11-30  0.992070  0.013605  0.013605
+2

:

df['return'] = df['price']/df['price'].shift(1) - 1

df.shift(1) python LEAD() LEG() SQL.

+3

To create a synthetic index starting at 100

df['PrcRtnIdx'] = df.price.pct_change().fillna(0).add(1).cumprod().mul(100)

print(df)

    id    weight        date     price   PrcRtnIdx
0    1  0.002796  2005-11-15  0.998298  100.000000
1    1  0.002796  2005-11-16  1.014242  101.597118
2    1  0.002796  2005-11-17  1.016452  101.818495
3    1  0.002796  2005-11-18  1.026396  102.814590
4    1  0.002796  2005-11-19  1.026047  102.779631
5    1  0.002796  2005-11-20  1.024285  102.603131
6    1  0.002796  2005-11-21  1.018764  102.050089
7    1  0.002796  2005-11-22  1.033175  103.493646
8    1  0.002796  2005-11-23  1.058509  106.031365
9    1  0.002796  2005-11-24  1.061231  106.304029
10   1  0.002796  2005-11-25  1.058137  105.994102
11   1  0.002796  2005-11-26  0.999380  100.108384
12   1  0.002796  2005-11-27  0.990504   99.219271
13   1  0.002796  2005-11-28  0.993764   99.545827
14   1  0.002796  2005-11-29  0.978754   98.042268
15   1  0.002796  2005-11-30  0.992070   99.376138

response to @mydaemon

df['PrcRtnIdx'] = \
    df.groupby('id').price.apply(
        lambda x: x.pct_change().fillna(0).add(1).cumprod().mul(100))
+2
source

Source: https://habr.com/ru/post/1669488/


All Articles