Dask DataFrame groupby does not have len()

If you have a groupby object based on a dask dataframe, why does len(<groupby object>) return an error? (Is this a bug or a feature?)

1 answer

It is simply not implemented. You might want to raise an issue (or, even better, a pull request). Pragmatically, I would simply call nunique on your grouper object.

Before

g = df.groupby(df.x + df.y)
result = len(g)

After

result = (df.x + df.y).nunique()

Operationally, this is better because it can be lazy (the result of len in Python must be a concrete integer) and because you can opt for nunique_approx, which will be much faster.


Source: https://habr.com/ru/post/1693476/

