Summarize categorical data in a Dask DataFrame

By default, the Dask DataFrame describe method only summarizes numeric columns. According to the docs, I should be able to get descriptions of categorical columns by passing the include parameter. However,

df.describe(include=['category']).compute()

leads to

TypeError: describe() got an unexpected keyword argument 'include'.

I also tried a slightly different approach:

df.select_dtypes(include=['category']).describe().compute()

and this time I get

ValueError: DataFrame contains only non-numeric data.

Could you please advise how best to summarize categorical columns in a Dask DataFrame?

1 answer

Summarize only numeric or object columns

  1. To call describe() for numeric columns only, use describe(include=[np.number])
  2. To call describe() for object columns only, use describe(include=['O']).
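The two calls above follow the pandas describe() signature, so here is a minimal sketch in plain pandas of what each include value selects. The sample data and column names are made up for illustration. Whether the same include argument works on a Dask DataFrame depends on the installed Dask version (the TypeError in the question suggests that version did not accept it), so treat this as an illustration of the parameter rather than a guaranteed Dask recipe.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this example.
df = pd.DataFrame(
    {
        "price": [1.5, 2.0, 3.5, 2.0],
        # Force object dtype so include=['O'] matches on any pandas version.
        "city": pd.Series(["Oslo", "Rome", "Oslo", "Oslo"], dtype=object),
        "tier": pd.Categorical(["S", "M", "S", "L"]),
    }
)

# Numeric columns only: count, mean, std, min, quartiles, max.
numeric_summary = df.describe(include=[np.number])

# Object columns only: count, unique, top, freq.
object_summary = df.describe(include=["O"])

# Categorical columns only: count, unique, top, freq.
category_summary = df.describe(include=["category"])

print(numeric_summary)
print(object_summary)
print(category_summary)
```

If the installed Dask version accepts include, the equivalent call would presumably be df.describe(include=['category']).compute(), as attempted in the question.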


Source: https://habr.com/ru/post/1692636/

