Pandas Expand Collapse

Hi and thanks in advance for any help. I have a simple two-column framework. I did not explicitly specify the index, but I believe that an integer index appears in the dataframe, which I see on the left side of the output. Question below:

df = pandas.DataFrame(res)
df.columns = ['date', 'pb']
df['date'] = pandas.to_datetime(df['date'])
df.dtypes

date    datetime64[ns]
pb             float64
dtype: object


date    pb
0   2016-04-01  24199.933333
1   2016-03-01  23860.870968
2   2016-02-01  23862.275862
3   2016-01-01  25049.193548
4   2015-12-01  24882.419355
5   2015-11-01  24577.000000


date    datetime64[ns]
pb             float64
dtype: object

I would like to rotate the data frame so that I have years at the top (columns): 2016, 2015, etc. and a line for each month: 1 - 12.

+4
source share
2 answers

Using .dt accessor, you can create columns throughout the year and month, and then rotate on them:

df['Year'] = df['date'].dt.year
df['Month'] = df['date'].dt.month
pd.pivot_table(df,index='Month',columns='Year',values='pb',aggfunc=np.sum)

Alternatively, if you do not want these other columns you can do:

pd.pivot_table(df,index=df['date'].dt.month,columns=df['date'].dt.year,
               values='pb',aggfunc=np.sum)

With my dummy dataset that produces:

Year             2013      2014      2015      2016
date
1             92924.0  102072.0  134660.0  132464.0
2             79935.0   82145.0  118234.0  147523.0
3             86878.0   94959.0  130520.0  138325.0
4             80267.0   89394.0  120739.0  129002.0
5             79283.0   91205.0  118904.0  125878.0
6             77828.0   89884.0  112488.0  121953.0
7             78839.0   94407.0  113124.0       NaN
8             79885.0   97513.0  116771.0       NaN
9             79455.0   99555.0  114833.0       NaN
10            77616.0   98764.0  115872.0       NaN
11            75043.0   95756.0  107123.0       NaN
12            81996.0  102637.0  114952.0       NaN
+6

stack pivot

df = pd.DataFrame(
         dict(date=pd.date_range('2013-01-01', periods=42, freq='M'),
              pb=np.random.rand(42)))

df.set_index([df.date.dt.month, df.date.dt.year]).pb.unstack()

enter image description here

+2

Source: https://habr.com/ru/post/1649329/


All Articles