Aggregation does not work when using lambda

I am trying to move part of my application from pandas to dask, and I hit a roadblock when using a lambda function in a groupby on a dask DataFrame.

import dask.dataframe as dd

# pandasDataFrame is an existing pandas DataFrame that contains the columns used below
dask_df = dd.from_pandas(pandasDataFrame, npartitions=2)
dask_df = dask_df.groupby(
    ['one', 'two', 'three', 'four'],
    sort=False
).agg({'AGE': lambda x: x * x})

This code fails with the following error:

ValueError: unknown aggregate lambda

My real lambda is more complicated than the one shown here, but its content does not matter: the error is always the same. There is a very similar example in the documentation, so this should work, and I am not sure what I am missing.

The same groupby works in pandas, but I need to improve its performance.
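
For comparison, here is a minimal sketch of the pandas version of this groupby. The sample data and the name pandas_df are made up for illustration, and the lambda is swapped for a reducing one (sum of squares), on the assumption that the real function returns a single value per group:

```python
import pandas as pd

# Hypothetical sample data: the column names follow the question,
# the values are invented for illustration only.
pandas_df = pd.DataFrame({
    'one':   ['a', 'a', 'b', 'b'],
    'two':   ['x', 'x', 'y', 'y'],
    'three': [1, 1, 2, 2],
    'four':  [True, True, False, False],
    'AGE':   [1, 2, 3, 4],
})

# Same grouping keys as in the dask snippet; the lambda here is an
# assumed stand-in that reduces each group to one number.
result = pandas_df.groupby(
    ['one', 'two', 'three', 'four'],
    sort=False
).agg({'AGE': lambda x: (x * x).sum()})

print(result)
```

In pandas the lambda is called once per group with that group's AGE values as a Series, so any Python callable can be passed to agg this way.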

I am using dask 0.12.0 with Python 3.5.


Source: https://habr.com/ru/post/1662136/

