Calculating returns for multiple stocks at once using pandas

I have closing prices for several stocks over several days in a DataFrame like this:

In [67]: df
Out[67]:
          Date Symbol   Close
0   12/30/2016   AMZN  749.87
1   12/29/2016   AMZN  765.15
2   12/28/2016   AMZN  772.13
3   12/27/2016   AMZN  771.40
4   12/30/2016  GOOGL  792.45
5   12/29/2016  GOOGL  802.88
6   12/28/2016  GOOGL  804.57
7   12/27/2016  GOOGL  805.80
8   12/30/2016   NFLX  123.80
9   12/29/2016   NFLX  125.33
10  12/28/2016   NFLX  125.89
11  12/27/2016   NFLX  128.35

I would like to calculate the daily returns of these stocks using pandas. The result should look like this:

         Date Symbol     Return
0   12/27/2016   AMZN       NaN
1   12/28/2016   AMZN  0.000946
2   12/29/2016   AMZN -0.009040
3   12/30/2016   AMZN -0.019970
4   12/27/2016  GOOGL       NaN
5   12/28/2016  GOOGL -0.001526
6   12/29/2016  GOOGL -0.002101
7   12/30/2016  GOOGL -0.012991
8   12/27/2016   NFLX       NaN
9   12/28/2016   NFLX -0.019166
10  12/29/2016   NFLX -0.004448
11  12/30/2016   NFLX -0.012208
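
For reference, the Return values above are consistent with the simple daily return, i.e. today's close divided by the previous day's close, minus one. A minimal arithmetic check for one row (the variable names are illustrative only):

```python
# Closing prices taken from the table above.
prev_close = 771.40  # AMZN close on 12/27/2016
close = 772.13       # AMZN close on 12/28/2016

# Simple daily return: relative change from the previous close.
simple_return = close / prev_close - 1
print(round(simple_return, 6))  # 0.000946
```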

I got the above output using the following code, but I feel this can be simplified further.

In [70]: rtn = df.pivot(index="Date", columns="Symbol", values="Close").pct_change().reset_index()
In [73]: pd.melt(rtn, id_vars='Date', value_vars=list(rtn.columns[1:]), var_name='Symbol', value_name='Return')
2 answers

You can first use sort_values and then groupby with DataFrameGroupBy.pct_change:

df = df.sort_values(['Symbol','Date']).reset_index(drop=True)
df['Return'] = df.groupby('Symbol')['Close'].pct_change()
print(df)
          Date Symbol   Close    Return
0   12/27/2016   AMZN  771.40       NaN
1   12/28/2016   AMZN  772.13  0.000946
2   12/29/2016   AMZN  765.15 -0.009040
3   12/30/2016   AMZN  749.87 -0.019970
4   12/27/2016  GOOGL  805.80       NaN
5   12/28/2016  GOOGL  804.57 -0.001526
6   12/29/2016  GOOGL  802.88 -0.002101
7   12/30/2016  GOOGL  792.45 -0.012991
8   12/27/2016   NFLX  128.35       NaN
9   12/28/2016   NFLX  125.89 -0.019166
10  12/29/2016   NFLX  125.33 -0.004448
11  12/30/2016   NFLX  123.80 -0.012208
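
One caveat: in the question the Date column holds MM/DD/YYYY strings, so sort_values orders them lexicographically. That happens to be chronological here because every date falls in December 2016, but it would not be across a year boundary. A minimal sketch, assuming the strings always follow the %m/%d/%Y format, that parses the dates first (only two of the question's symbols are included, for brevity):

```python
import pandas as pd

# Sample rows copied from the question (AMZN and NFLX only).
df = pd.DataFrame({
    "Date": ["12/30/2016", "12/29/2016", "12/28/2016", "12/27/2016"] * 2,
    "Symbol": ["AMZN"] * 4 + ["NFLX"] * 4,
    "Close": [749.87, 765.15, 772.13, 771.40, 123.80, 125.33, 125.89, 128.35],
})

# Parse the strings so that sorting is chronological, not lexicographic.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Oldest row first within each symbol, then the per-symbol percentage change.
df = df.sort_values(["Symbol", "Date"]).reset_index(drop=True)
df["Return"] = df.groupby("Symbol")["Close"].pct_change()
print(df)
```

The first row of each symbol is NaN because it has no previous close to compare against.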

You can use set_index and unstack, which sorts the dates for you, then apply pct_change and stack back.

print(
    df.set_index(['Date', 'Symbol'])
      .Close.unstack().pct_change()
      .stack(dropna=False).reset_index(name='Return')
      .sort_values(['Symbol', 'Date'])
      .reset_index(drop=True)
)


         Date Symbol    Return
0  2016-12-27   AMZN       NaN
1  2016-12-28   AMZN  0.000946
2  2016-12-29   AMZN -0.009040
3  2016-12-30   AMZN -0.019970
4  2016-12-27  GOOGL       NaN
5  2016-12-28  GOOGL -0.001526
6  2016-12-29  GOOGL -0.002101
7  2016-12-30  GOOGL -0.012991
8  2016-12-27   NFLX       NaN
9  2016-12-28   NFLX -0.019166
10 2016-12-29   NFLX -0.004448
11 2016-12-30   NFLX -0.012208
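
The ISO-style dates in this output suggest the Date column had been parsed to datetime beforehand, although the snippet does not show that step. As a rough cross-check, the sketch below computes the returns both ways, per-symbol groupby on the long frame and a wide reshape with one column per symbol, and compares them. It assumes %m/%d/%Y date strings, uses pivot and melt for the reshape rather than the unstack/stack chain above, and includes only two of the question's symbols.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (AMZN and NFLX only).
df = pd.DataFrame({
    "Date": ["12/30/2016", "12/29/2016", "12/28/2016", "12/27/2016"] * 2,
    "Symbol": ["AMZN"] * 4 + ["NFLX"] * 4,
    "Close": [749.87, 765.15, 772.13, 771.40, 123.80, 125.33, 125.89, 128.35],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Approach 1: percentage change within each symbol on the long frame.
long_df = df.sort_values(["Symbol", "Date"]).reset_index(drop=True)
by_group = long_df.groupby("Symbol")["Close"].pct_change()

# Approach 2: one column per symbol, percentage change down the date index,
# then melt back to long format in the same (Symbol, Date) order.
wide = df.pivot(index="Date", columns="Symbol", values="Close").pct_change()
by_pivot = (
    wide.reset_index()
        .melt(id_vars="Date", var_name="Symbol", value_name="Return")
        .sort_values(["Symbol", "Date"])
        .reset_index(drop=True)
)

# NaN in the first row of each symbol is treated as equal on both sides.
same = np.allclose(by_group.to_numpy(), by_pivot["Return"].to_numpy(), equal_nan=True)
print(same)
```

The two approaches agree here because every symbol has a row for every date. If symbols traded on different dates, the wide reshape would introduce missing cells, whereas the groupby version only ever compares rows that exist for each symbol.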

Source: https://habr.com/ru/post/1665493/

