Multiplying DataFrame Rows by a NumPy Array

I have a DataFrame that looks like this:

         Date   Last  portfolioID FinancialInstrument
1   2018-03-28  64.67            1                 Oil
2   2018-03-29  64.91            1                 Oil
3   2018-04-02  62.85            1                 Oil
4   2018-04-03  63.57            1                 Oil
5   2018-04-04  63.56            1                 Oil
6   2018-04-05  63.73            1                 Oil
7   2018-04-06  61.93            1                 Oil
8   2018-03-23  65.74            3                 Oil
9   2018-03-26  65.49            3                 Oil
10  2018-03-27  64.67            3                 Oil
11  2018-03-28  64.67            3                 Oil
12  2018-03-29  64.91            3                 Oil
13  2018-04-02  62.85            3                 Oil
14  2018-04-03  63.57            3                 Oil
15  2018-04-04  63.56            3                 Oil
16  2018-04-05  63.73            3                 Oil
17  2018-04-06  61.93            3                 Oil
18  2018-04-02  62.85            5                 Oil
19  2018-04-03  63.57            5                 Oil
20  2018-04-04  63.56            5                 Oil
21  2018-04-05  63.73            5                 Oil
22  2018-04-06  61.93            5                 Oil

and a NumPy array that looks like this:

[ 152.69506795   76.05719501  127.28719173]

I group the DataFrame by portfolioID. The first group corresponds to the first value in the NumPy array, the second group to the second value, and so on. Is there a way to multiply the Last column of each group by its corresponding NumPy array value?

This is what I have so far, but it fails with the error "Length must be equal." Here shares is the NumPy array:

for pid, group in data.groupby('portfolioID'):
    lastCol = group.Last
    clumN = lastCol.multiply(shares, axis=0)
2 answers

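Use pandas.Series.factorize to turn portfolioID into integer codes (0, 1, 2, ...) numbered in order of first appearance, then use those codes to index the array so that every row picks up the value for its group: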

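import numpy as np

val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

# factorize()[0] holds each row's group code; indexing val_arr with it gives one multiplier per row
df.Last * val_arr[df.portfolioID.factorize()[0]]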

# 1     9874.790044
# 2     9911.436861
# 3     9596.885021
# 4     9706.825470
# 5     9705.298519
# 6     9731.256680
# 7     9456.405558
# 8     5000.000000
# 9     4980.985701
# 10    4918.618801
# 11    4918.618801
# 12    4936.872528
# 13    4780.194706
# 14    4834.955887
# 15    4834.195315
# 16    4847.125038
# 17    4710.222087
# 18    8000.000000
# 19    8091.646778
# 20    8090.373906
# 21    8112.012729
# 22    7882.895784
# Name: Last, dtype: float64
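For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The small DataFrame is only a stand-in built from a few of the question's rows, and the column names Value and ValueByGroup are made up for illustration. It also shows groupby(...).ngroup() as a possible alternative when the rows are not already ordered by portfolioID, since factorize numbers groups by first appearance while groupby iterates in sorted key order.

```python
import numpy as np
import pandas as pd

# Stand-in data: a handful of rows from the question (assumption, not the full table).
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
shares = np.array([152.69506795, 76.05719501, 127.28719173])

# codes[i] is the position of row i's portfolioID among the unique IDs,
# numbered in order of first appearance.
codes, uniques = df["portfolioID"].factorize()

# Indexing shares with codes expands it to one multiplier per row.
df["Value"] = df["Last"] * shares[codes]

# ngroup() numbers the groups in the sorted order that groupby iterates in,
# so it does not depend on the order in which the IDs first appear.
group_no = df.groupby("portfolioID").ngroup().to_numpy()
df["ValueByGroup"] = df["Last"] * shares[group_no]

print(df)
```

With data that is already ordered by portfolioID, as in the question, both columns come out identical.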

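Alternatively, count the rows in each group of df and repeat each value of arr that many times with np.repeat. This relies on the rows of df already being ordered by portfolioID, as they are in the question: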

arr = np.array([152.69506795, 76.05719501, 127.28719173])
df.Last * np.repeat(arr, df.groupby("portfolioID")["Last"].count())
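If the rows are not already ordered by portfolioID, the repeated values will not line up with the rows. One possible way around that, sketched below on made-up shuffled data, is to sort first, multiply, and then restore the original order. The names ordered, multipliers, result and the Value column are illustrative only.

```python
import numpy as np
import pandas as pd

# Stand-in data with the portfolio IDs deliberately out of order (assumption).
df = pd.DataFrame({
    "Last": [62.85, 64.67, 65.74, 64.91, 65.49],
    "portfolioID": [5, 1, 3, 1, 3],
})
arr = np.array([152.69506795, 76.05719501, 127.28719173])

# A stable sort makes each group's rows contiguous, in ascending ID order.
ordered = df.sort_values("portfolioID", kind="stable")

# Rows per group, indexed by portfolioID in ascending order.
counts = ordered.groupby("portfolioID")["Last"].count()

# np.repeat produces one multiplier per row of `ordered`.
multipliers = np.repeat(arr, counts.to_numpy())
ordered = ordered.assign(Value=ordered["Last"] * multipliers)

# Put the rows back in their original order.
result = ordered.sort_index()
print(result)
```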

Source: https://habr.com/ru/post/1695949/

