Multiplying DataFrame Rows by a NumPy Array

Question

I have a DataFrame that looks like this:

         Date   Last  portfolioID FinancialInstrument
1   2018-03-28  64.67            1                 Oil
2   2018-03-29  64.91            1                 Oil
3   2018-04-02  62.85            1                 Oil
4   2018-04-03  63.57            1                 Oil
5   2018-04-04  63.56            1                 Oil
6   2018-04-05  63.73            1                 Oil
7   2018-04-06  61.93            1                 Oil
8   2018-03-23  65.74            3                 Oil
9   2018-03-26  65.49            3                 Oil
10  2018-03-27  64.67            3                 Oil
11  2018-03-28  64.67            3                 Oil
12  2018-03-29  64.91            3                 Oil
13  2018-04-02  62.85            3                 Oil
14  2018-04-03  63.57            3                 Oil
15  2018-04-04  63.56            3                 Oil
16  2018-04-05  63.73            3                 Oil
17  2018-04-06  61.93            3                 Oil
18  2018-04-02  62.85            5                 Oil
19  2018-04-03  63.57            5                 Oil
20  2018-04-04  63.56            5                 Oil
21  2018-04-05  63.73            5                 Oil
22  2018-04-06  61.93            5                 Oil

and a NumPy array that looks like this:

[ 152.69506795   76.05719501  127.28719173]

I group the DataFrame by portfolioID, where the first group corresponds to the first value in the NumPy array, the second group to the second value, and so on. My question is: is there a way to multiply the Last column of the DataFrame by its corresponding NumPy array value?

This is what I have, but I get the error "Length must be equal." (shares is the NumPy array):

for pid, group in data.groupby('portfolioID'):
    lastCol = group.Last
    clumN = lastCol.multiply(shares, axis=0)

python arrays numpy pandas

pa0011 Apr 9 '18 at 3:14

2 answers

miradulo · Answer 1 · 2018-04-09T03:38:37+0000

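You can use pandas.Series.factorize to convert portfolioID into consecutive integer codes, index the array of values with those codes, and multiply the result by the Last column: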

val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

df.Last * val_arr[df.portfolioID.factorize()[0]]

# 1     9874.790044
# 2     9911.436861
# 3     9596.885021
# 4     9706.825470
# 5     9705.298519
# 6     9731.256680
# 7     9456.405558
# 8     5000.000000
# 9     4980.985701
# 10    4918.618801
# 11    4918.618801
# 12    4936.872528
# 13    4780.194706
# 14    4834.955887
# 15    4834.195315
# 16    4847.125038
# 17    4710.222087
# 18    8000.000000
# 19    8091.646778
# 20    8090.373906
# 21    8112.012729
# 22    7882.895784
# Name: Last, dtype: float64
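
To see what the indexing step does, here is a minimal sketch on a hand-built five-row subset of the data above (the small frame and the names codes and uniques are only for illustration). factorize numbers the IDs in order of first appearance, so this relies on the array being ordered the same way; passing sort=True to factorize would number them in sorted order instead.

```python
import numpy as np
import pandas as pd

# Hand-built subset of the question's data (illustrative only)
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

# codes holds one integer per row (0 for ID 1, 1 for ID 3, 2 for ID 5);
# uniques holds the distinct IDs in order of first appearance
codes, uniques = df.portfolioID.factorize()

# Fancy indexing expands val_arr to one multiplier per row
per_row = val_arr[codes]
result = df.Last * per_row
print(result)
```

Because per_row has the same length as df.Last, the multiplication is element-wise and no length error is raised.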

Tai · Answer 2 · 2018-04-09T03:55:45+0000

Count the rows in each portfolioID group of df, then repeat each element of arr that many times with np.repeat:

arr = np.array([152.69506795, 76.05719501, 127.28719173])
df.Last * np.repeat(arr, df.groupby("portfolioID")["Last"].count())
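
A minimal sketch of the same idea on a hand-built five-row subset (the small frame and the names counts and per_row are only for illustration). Note the assumption: the rows must already be ordered by portfolioID, as they are in the question, because np.repeat lays the multipliers out group after group in sorted key order.

```python
import numpy as np
import pandas as pd

# Hand-built subset of the question's data, already ordered by portfolioID
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
arr = np.array([152.69506795, 76.05719501, 127.28719173])

# Number of rows in each group, in sorted portfolioID order
counts = df.groupby("portfolioID")["Last"].count()

# Repeat each multiplier once per row of its group
per_row = np.repeat(arr, counts.to_numpy())
result = df.Last * per_row
print(result)
```

If the rows were not grouped contiguously, sorting by portfolioID first (or using the factorize approach) would be the safer choice.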
