Pandas: How to group by a column and count the unique values in another column?

I have a DataFrame that has three columns:

id     order     ordernumber  
1      app         1
1      pip         2
1      org         3
2      app         1
3      app         1
3      org         3

In the column "order" there are only 3 unique values (app, pip and org). I would like to get a DataFrame that shows, for each id, how many orders of each type it has, as well as its total number of orders.

The result will look like this:

id     app        pip    org    total
1      1           1      1      3
2      1           0      0      1
3      1           0      1      2
2 answers

You can use pivot_table to get the counts:

>>> df2 = df.pivot_table(index='id', columns='order', aggfunc='size', fill_value=0)
>>> df2
order  app  org  pip
id
1        1    1    1
2        1    0    0
3        1    1    0

Then you can add the "total" column by summing each row:

>>> df2['total'] = df2.sum(axis=1)
>>> df2
order  app  org  pip  total
id
1        1    1    1      3
2        1    0    0      1
3        1    1    0      2
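
For reference, here is a self-contained sketch of the two steps above, run as a script. The DataFrame is rebuilt from the table in the question; the variable name counts and the column reordering to app, pip, org (to match the layout the question asks for) are my own additions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 3, 3],
    "order": ["app", "pip", "org", "app", "app", "org"],
    "ordernumber": [1, 2, 3, 1, 1, 3],
})

# Count the rows for every (id, order) pair; missing pairs become 0.
counts = df.pivot_table(index="id", columns="order", aggfunc="size", fill_value=0)

# Reorder the columns to match the layout requested in the question.
counts = counts[["app", "pip", "org"]]

# Row-wise sum gives the total number of orders per id.
counts["total"] = counts.sum(axis=1)

print(counts)
```

Computing the total before adding any other columns matters: sum(axis=1) adds up every column present at that moment.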

An alternative to ajcr's answer:

df2 = df.pivot_table(index='id', columns='order', aggfunc=lambda x: len(x.unique()), margins=True)

Here aggfunc returns the number of unique values in each group.
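
Counting unique values and counting rows give the same result on the question's data, but they differ as soon as a group repeats a value. The sketch below uses hypothetical data (not from the question) in which id 1 has two app rows sharing ordernumber 1. It uses the built-in "nunique" aggregation, which behaves like the lambda here because there are no missing values.

```python
import pandas as pd

# Hypothetical data: two "app" rows for id 1 with the same ordernumber.
dup = pd.DataFrame({
    "id": [1, 1, 1],
    "order": ["app", "app", "pip"],
    "ordernumber": [1, 1, 2],
})

# Number of rows per (id, order) pair.
rows = dup.pivot_table(index="id", columns="order", aggfunc="size", fill_value=0)

# Number of distinct ordernumber values per (id, order) pair.
uniques = dup.pivot_table(index="id", columns="order", values="ordernumber",
                          aggfunc="nunique", fill_value=0)

print(rows.loc[1, "app"])     # two rows
print(uniques.loc[1, "app"])  # one distinct ordernumber
```

Which of the two is wanted depends on whether repeated order numbers should count once or every time.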

In [4]: df2 = df.pivot_table(index='id', columns='order', aggfunc=lambda x: len(x.unique()), margins=True)

In [5]: df2
Out[5]:
      ordernum
order      app org pip All
id
1            1   1   1   3
2            1 NaN NaN   1
3            1   1 NaN   2
All          1   1   1   3

margins=True makes pivot_table add an "All" row and column containing the aggregate over each column and each row.

To replace the NaN values with 0, use df2.fillna(0, inplace=True):

In [6]: df2.fillna(0, inplace=True)

In [7]: df2
Out[7]:
      ordernum
order      app org pip All
id
1            1   1   1   3
2            1   0   0   1
3            1   1   0   2
All          1   1   1   3
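
The same steps as a standalone script, as a sketch rather than the answer's exact code. It assumes the value column is named ordernumber as in the question (the output above shows it as ordernum), passes it explicitly through values= so the result has a single column level, and reassigns the result of fillna instead of using inplace=True. The astype(int) call is an optional extra that turns the float counts left by the NaN values back into integers.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 3, 3],
    "order": ["app", "pip", "org", "app", "app", "org"],
    "ordernumber": [1, 2, 3, 1, 1, 3],
})

# Count distinct ordernumber values per (id, order) pair and add "All" margins.
table = df.pivot_table(
    index="id",
    columns="order",
    values="ordernumber",
    aggfunc=lambda x: len(x.unique()),
    margins=True,
)

# Missing (id, order) pairs come back as NaN; replace them with 0.
table = table.fillna(0).astype(int)

print(table)
```

Note that the "All" margins are produced by the same aggfunc, so they hold the number of distinct order numbers per row and per column, not the sum of the cells. In the question's data the two coincide for the "All" column.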

Source: https://habr.com/ru/post/1624325/

