Pivoting a pandas DataFrame - AssertionError: index length does not match values

I have a pandas.DataFrame that won't pivot the way I expect. While pivot_table arranges everything correctly, the fact that it uses aggregate functions to get there is off-putting. In addition, pivot_table seems to return an unnecessarily complex object rather than a flat data frame.

Consider the following example.

    import pandas as pd

    df = pd.DataFrame({
        'firstname': ['Jon']*3 + ['Amy']*2,
        'lastname': ['Cho']*3 + ['Frond']*2,
        'vehicle': ['bike', 'car', 'plane', 'bike', 'plane'],
        'weight': [81.003]*3 + [65.6886]*2,
        'speed': [29.022, 95.1144, 302.952, 27.101, 344.2],
    })
    # Note: the result is not assigned, so df itself is left unchanged
    df.set_index(['firstname', 'lastname', 'weight'])

    print('------ Unnecessary pivot_table does averaging ------')
    # index= / columns= were spelled rows= / cols= in older pandas releases
    print(pd.pivot_table(df, values='speed',
                         index=['firstname', 'lastname', 'weight'],
                         columns='vehicle'))

    print('------ pivot method dies ------')
    print(df.pivot(index=['firstname', 'lastname', 'weight'],
                   columns='vehicle', values='speed'))

pivot_table results:

    vehicle                       bike      car    plane
    firstname lastname weight
    Amy       Frond    65.6886  27.101      NaN  344.200
    Jon       Cho      81.0030  29.022  95.1144  302.952

Is there a way to get pivot to give essentially the same result as the pivot_table command (but hopefully flatter and tidier)? Failing that, how do I flatten the output of pivot_table? What I want as output is something more like:

    firstname lastname  weight    bike      car    plane
    Amy       Frond    65.6886  27.101      NaN  344.200
    Jon       Cho      81.0030  29.022  95.1144  302.952
1 answer

If you don't need the aggregation that pivot_table performs, pivot is indeed the function you want. However, pivot does not work when given multiple index columns (I don't actually know why). There is a function similar to pivot, unstack, which works the same way but operates on the (multi-)index instead of on columns.
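A side note on the pivot limitation mentioned above: it appears to depend on the pandas version. As far as I know, releases from 1.1.0 onward accept a list for index, so the following sketch should run there without the AssertionError; on older versions it is expected to fail as described in the question. The variable name wide is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'firstname': ['Jon']*3 + ['Amy']*2,
    'lastname': ['Cho']*3 + ['Frond']*2,
    'vehicle': ['bike', 'car', 'plane', 'bike', 'plane'],
    'weight': [81.003]*3 + [65.6886]*2,
    'speed': [29.022, 95.1144, 302.952, 27.101, 344.2],
})

# Assumption: pandas >= 1.1.0, where pivot accepts a list of index columns.
# No aggregation takes place; each (index, vehicle) pair must be unique.
wide = df.pivot(index=['firstname', 'lastname', 'weight'],
                columns='vehicle', values='speed')
print(wide)
```

If an older pandas has to be supported, the unstack approach does not rely on this behaviour.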

So, to use unstack, first set all the columns that should end up as row or column labels as the index:

    df2 = df.set_index(['firstname', 'lastname', 'weight', 'vehicle'])

and then unstack the last level (the default), which is "vehicle"; its values become the column labels:

    In [3]: df2.unstack()
    Out[3]:
                                 speed
    vehicle                       bike      car    plane
    firstname lastname weight
    Amy       Frond    65.6886  27.101      NaN  344.200
    Jon       Cho      81.0030  29.022  95.1144  302.952

And if you don't want the multi-index, you can "flatten" the result with reset_index.
The only possible problem you may run into is that the columns also have two levels, so drop the first column level first and then reset the index to get a really flat data frame:

    In [17]: df3 = df2.unstack()

    In [18]: df3.columns = df3.columns.droplevel(0)

    In [19]: df3.reset_index()
    Out[19]:
    vehicle firstname lastname   weight    bike      car    plane
    0             Amy    Frond  65.6886  27.101      NaN  344.200
    1             Jon      Cho  81.0030  29.022  95.1144  302.952
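As a possible variation on the steps above (a sketch, not the only way to do it): selecting the 'speed' column before unstacking yields single-level columns straight away, so no droplevel is needed, and rename_axis can clear the leftover "vehicle" label that shows up in the header of the output. The name flat is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'firstname': ['Jon']*3 + ['Amy']*2,
    'lastname': ['Cho']*3 + ['Frond']*2,
    'vehicle': ['bike', 'car', 'plane', 'bike', 'plane'],
    'weight': [81.003]*3 + [65.6886]*2,
    'speed': [29.022, 95.1144, 302.952, 27.101, 344.2],
})

flat = (
    df.set_index(['firstname', 'lastname', 'weight', 'vehicle'])['speed']
      .unstack()                   # Series.unstack: 'vehicle' values become plain columns
      .rename_axis(columns=None)   # remove the 'vehicle' name from the column axis
      .reset_index()               # move firstname/lastname/weight back to columns
)
print(flat)
```

The result has one row per person and the columns firstname, lastname, weight, bike, car and plane, with NaN where a person has no entry for a vehicle.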

Source: https://habr.com/ru/post/1501551/

