How to pivot (rotate) a pandas DataFrame

Let's say we have a DataFrame that looks like this:

      day_of_week   ice_cream  count  proportion
    0      Friday     vanilla    638    0.094473
    1      Friday   chocolate   2048    0.663506
    2      Friday  strawberry   4088    0.251021
    3      Monday     vanilla    448    0.079736
    4      Monday   chocolate   2332    0.691437
    5      Monday  strawberry    441    0.228828
    6    Saturday     vanilla     24    0.073350
    7    Saturday   chocolate    244    0.712930
    ...       ...

I want the new DataFrame to collapse on day_of_week as an index, so it looks like this:

      day_of_week   vanilla  chocolate  strawberry
    0      Friday  0.094473   0.663506    0.251021
    1      Monday  0.079736   0.691437    0.228828
    2    Saturday       ...        ...         ...

What is the cleanest way I can implement this?

+5
4 answers

df.pivot_table is the right solution:

    In [31]: df.pivot_table(values='proportion', index='day_of_week', columns='ice_cream').reset_index()
    Out[31]:
    ice_cream day_of_week  chocolate  strawberry   vanilla
    0              Friday   0.663506    0.251021  0.094473
    1              Monday   0.691437    0.228828  0.079736
    2            Saturday   0.712930         NaN  0.073350

If you leave out reset_index(), the result keeps day_of_week as its index, which may be more useful to you.
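
To make the difference concrete, here is a minimal sketch. The four-row frame is only a stand-in built from values in the question, and the names wide and flat are chosen for illustration.

```python
import pandas as pd

# Small stand-in for the question's data (values copied from the question).
df = pd.DataFrame({
    "day_of_week": ["Friday", "Friday", "Monday", "Monday"],
    "ice_cream": ["vanilla", "chocolate", "vanilla", "chocolate"],
    "proportion": [0.094473, 0.663506, 0.079736, 0.691437],
})

wide = df.pivot_table(values="proportion", index="day_of_week", columns="ice_cream")

# Without reset_index(): day_of_week is the index, so cells can be looked up by label.
friday_vanilla = wide.loc["Friday", "vanilla"]

# With reset_index(): day_of_week becomes an ordinary column again.
flat = wide.reset_index()
flat_columns = list(flat.columns)
```

The indexed form is convenient for label lookups with .loc, while the flat form is closer to the layout asked for in the question.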

Note that a pivot table necessarily performs a dimension reduction when the values column is not a function of the tuple (index, columns). If several rows share the same (index, columns) pair but have different values, pivot_table reduces them to a single value using an aggregation function, mean by default.
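
A small sketch of that aggregation behaviour follows. The data is hypothetical: Friday/chocolate deliberately appears twice with different values.

```python
import pandas as pd

# Hypothetical data: the (Friday, chocolate) pair occurs twice.
df = pd.DataFrame({
    "day_of_week": ["Friday", "Friday", "Friday"],
    "ice_cream": ["chocolate", "chocolate", "vanilla"],
    "proportion": [0.9, 0.8, 0.1],
})

# Default aggregation: the duplicated cell becomes the mean of its values.
by_mean = df.pivot_table(values="proportion", index="day_of_week", columns="ice_cream")

# Another reduction can be requested through aggfunc.
by_max = df.pivot_table(values="proportion", index="day_of_week",
                        columns="ice_cream", aggfunc="max")

# DataFrame.pivot does not aggregate, so it rejects the duplicated pair.
try:
    df.pivot(index="day_of_week", columns="ice_cream", values="proportion")
    pivot_failed = False
except ValueError:
    pivot_failed = True
```

If every (index, columns) pair is guaranteed to be unique, pivot and pivot_table give the same table; pivot_table is the safer choice when duplicates are possible.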

+4

You are looking for pivot_table:

    df = pd.pivot_table(df, index='day_of_week', columns='ice_cream', values='proportion')

This gives:

    ice_cream    chocolate  strawberry   vanilla
    day_of_week
    Friday        0.663506    0.251021  0.094473
    Monday        0.691437    0.228828  0.079736
    Saturday      0.712930         NaN  0.073350
+2

Use a pivot table:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'day_of_week': ['Friday', 'Sunday', 'Monday', 'Sunday', 'Friday', 'Friday'],
                       'count': [200, 300, 100, 50, 110, 90],
                       'ice_cream': ['choco', 'vanilla', 'vanilla', 'choco', 'choco', 'straw'],
                       'proportion': [.9, .1, .2, .3, .8, .4]})
    print(df)

    # To replace missing combinations with zero, pass fill_value=0 instead of np.nan
    tab = pd.pivot_table(df, index='day_of_week', columns='ice_cream',
                         values=['proportion'], fill_value=np.nan)
    print(tab)

Output:

       count day_of_week ice_cream  proportion
    0    200      Friday     choco         0.9
    1    300      Sunday   vanilla         0.1
    2    100      Monday   vanilla         0.2
    3     50      Sunday     choco         0.3
    4    110      Friday     choco         0.8
    5     90      Friday     straw         0.4
                proportion
    ice_cream        choco straw vanilla
    day_of_week
    Friday            0.85   0.4     NaN
    Monday             NaN   NaN     0.2
    Sunday            0.30   NaN     0.1
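
For comparison, here is a sketch of the zero-filled variant, using the same sample values. It passes values as a plain string rather than a list, which is a choice made here to keep the columns flat, not something the answer does.

```python
import pandas as pd

df = pd.DataFrame({
    "day_of_week": ["Friday", "Sunday", "Monday", "Sunday", "Friday", "Friday"],
    "ice_cream": ["choco", "vanilla", "vanilla", "choco", "choco", "straw"],
    "proportion": [0.9, 0.1, 0.2, 0.3, 0.8, 0.4],
})

# fill_value=0 replaces the NaN cells for combinations absent from the data.
tab = pd.pivot_table(df, index="day_of_week", columns="ice_cream",
                     values="proportion", fill_value=0)
print(tab)
```

Whether zero is a sensible substitute depends on the data: for proportions it reads as "none sold", which may or may not be what a missing row means.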
+1

Using set_index and unstack:

    (df.set_index(['day_of_week', 'ice_cream'])['proportion']
       .unstack()
       .reset_index()
       .rename_axis(None, axis=1))

      day_of_week  chocolate  strawberry   vanilla
    0      Friday   0.663506    0.251021  0.094473
    1      Monday   0.691437    0.228828  0.079736
    2    Saturday   0.712930         NaN  0.073350

Timing vs pivot_table:

[Figure: timing comparison of set_index/unstack and pivot_table]
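
The comparison can be rerun locally with a sketch along these lines. The sample frame, the helper names and the repeat count are assumptions made for illustration, and timings will vary with the machine, the data size and the pandas version.

```python
import timeit
import pandas as pd

# Stand-in data using the eight rows shown in the question.
df = pd.DataFrame({
    "day_of_week": ["Friday"] * 3 + ["Monday"] * 3 + ["Saturday"] * 2,
    "ice_cream": ["vanilla", "chocolate", "strawberry"] * 2 + ["vanilla", "chocolate"],
    "proportion": [0.094473, 0.663506, 0.251021,
                   0.079736, 0.691437, 0.228828,
                   0.073350, 0.712930],
})

def with_unstack():
    # Requires unique (day_of_week, ice_cream) pairs; no aggregation happens.
    return df.set_index(["day_of_week", "ice_cream"])["proportion"].unstack()

def with_pivot_table():
    # Aggregates with the mean, which is a no-op when pairs are unique.
    return df.pivot_table(values="proportion", index="day_of_week", columns="ice_cream")

# Both approaches should produce the same table for this data.
same_result = with_unstack().equals(with_pivot_table())

t_unstack = timeit.timeit(with_unstack, number=20)
t_pivot = timeit.timeit(with_pivot_table, number=20)
print(f"same result: {same_result}")
print(f"unstack: {t_unstack:.4f}s  pivot_table: {t_pivot:.4f}s")
```

Keep in mind that unstack raises an error if an (index, columns) pair is duplicated, so any speed difference comes with the loss of pivot_table's aggregation step.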

+1

Source: https://habr.com/ru/post/1264656/

