I have a large DataFrame with columns named currency and amount_in_euros. The currency column contains values such as EUR, GBR, etc., and amount_in_euros contains a floating-point value. For each customer, I want to sum the amounts per currency (EUR, GBR, etc.) and put the currency with the largest total in a new column. How can I do this in pandas?
Input:
Customer  currency  amount_in_euros
1         EUR       10
1         GBR       6
1         GBR       18
1         EUR       2
1         EUR       3
2         IND       12
.
.
.
Expected output:
Customer  currency  amount_in_euros  max
1         EUR       10               GBR
1         GBR       6                GBR
1         GBR       18               GBR
1         EUR       2                GBR
1         EUR       3                GBR
2         IND       12               IND
.
.
.
This is what I have tried so far:
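import pandas as pd

df = pd.read_csv('analysis.csv')
res = pd.DataFrame()
for u, v in df.groupby('Customer'):
    # Total per currency for this customer, largest total first
    temp = v[['currency', 'amount_in_euros']].groupby(['currency'])['amount_in_euros'].sum().reset_index().sort_values('amount_in_euros', ascending=False)
    v['max'] = temp['currency'].iloc[0]
    # Append this customer's rows to the result (copies res on every iteration)
    res = pd.concat([res, v])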
The code above works, but the append operation takes a lot of time on a large DataFrame. Please help me solve this problem. Thanks in advance.
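For reference, the same result can probably be computed without a Python loop or repeated appends. The sketch below is one possible approach, not a benchmarked solution. It rebuilds the sample rows from the input table in place of analysis.csv, and the intermediate names totals and top are only illustrative. It sums the amounts per (Customer, currency) pair, keeps the currency with the largest total for each customer, and maps that value back onto the original rows.

```python
import pandas as pd

# Sample rows copied from the input table (stands in for analysis.csv).
df = pd.DataFrame({
    "Customer": [1, 1, 1, 1, 1, 2],
    "currency": ["EUR", "GBR", "GBR", "EUR", "EUR", "IND"],
    "amount_in_euros": [10.0, 6.0, 18.0, 2.0, 3.0, 12.0],
})

# One row per (Customer, currency) pair with the summed amount.
totals = df.groupby(["Customer", "currency"], as_index=False)["amount_in_euros"].sum()

# Sort so the largest total comes first, then keep the first row per customer.
# If two currencies tie for a customer, which one is kept is not guaranteed.
top = (
    totals.sort_values("amount_in_euros", ascending=False)
          .drop_duplicates("Customer")
          .set_index("Customer")["currency"]
)

# Broadcast each customer's top currency back onto the original rows.
df["max"] = df["Customer"].map(top)
print(df)
```

Because the grouping is done once over the whole frame and the result is attached with map, nothing is rebuilt per customer, which is where the loop version spends most of its time.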