Pandas groupby and create a set of elements

I am using pandas groupby and want to use it to build a set of the elements in each group.

The following does not work:

df = df.groupby('col1')['col2'].agg({'size': len, 'set': set})

But the following works:

def to_set(x):
    return set(x)

df = df.groupby('col1')['col2'].agg({'size': len, 'set': to_set})

As I understand it, the two expressions are equivalent. Why does the first one not work?

2 answers

This is because set has type type, whereas to_set has type function:

type(set)
<class 'type'>

def to_set(x):
    return set(x)

type(to_set)
<class 'function'>
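The distinction above can be checked without pandas. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original answer. It shows that set is a class (an instance of type), that to_set is a plain function, and that both still produce the same result when called directly.

```python
import types


def to_set(x):
    # Thin wrapper: a plain function that delegates to the set constructor.
    return set(x)


# set is a class, so it is an instance of `type`, not a function object.
set_is_class = isinstance(set, type)
set_is_function = isinstance(set, types.FunctionType)

# to_set is an ordinary Python function.
to_set_is_function = isinstance(to_set, types.FunctionType)

# Called directly, the two behave identically.
same_result = set([1, 2, 2]) == to_set([1, 2, 2])

print(set_is_class, set_is_function, to_set_is_function, same_result)
```

So the two callables differ only in what kind of object they are, which is what matters to code that inspects its argument's type before calling it.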

According to the documentation, .agg() expects:

arg : function or dict

Function to use for aggregating groups.

  • If a function, it must either work when passed a DataFrame or when passed to DataFrame.apply.

  • If passed a dict, the keys must be DataFrame column names.

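Accepted combinations are:

  • string cythonized function name
  • function
  • list of functions
  • dict of columns -> functions
  • nested dict of names -> dicts of functions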
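Note that the dict-of-names form used in the question (a dict passed to .agg() on a single selected column) was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError. As a rough sketch of how the same result might be obtained today, assuming pandas 0.25 or later, named aggregation can be used instead. The sample data below is invented for illustration; only col1, col2 and to_set come from the question.

```python
import pandas as pd


def to_set(x):
    # Plain-function wrapper around set, as in the question.
    return set(x)


# Invented toy data; column names follow the question.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b"],
    "col2": [1, 2, 3, 3, 4],
})

# Named aggregation: each keyword becomes an output column name,
# and its value is the aggregation applied to col2 within each group.
result = df.groupby("col1")["col2"].agg(size="size", set=to_set)

print(result)
```

Here result is indexed by col1 and has two columns, size (the number of rows per group) and set (the distinct col2 values per group).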

Alternatively, wrapping set in a lambda works as well:

df = df.groupby('col1')['col2'].agg({'size': len, 'set': lambda x: set(x)})

Source: https://habr.com/ru/post/1689988/

