I am trying to convert a column to a categorical dtype and then group by it in pandas.
For example, I tried the following:
import pandas as pd
df = pd.DataFrame()
df['A'] = ['C1', 'C1', 'C2', 'C2', 'C3', 'C3']
df['B'] = [1,2,3,4,5,6]
df['A'] = df.loc[:,'A'].astype('category')
df2 = df[0:3]
result = df2.groupby(by='A')['B'].nunique()
print(result)
Unfortunately, this raises an exception:
  File "C:\Python34\lib\site-packages\pandas\core\internals.py", line 86, in __init__
    len(self.values), len(self.mgr_locs)))
ValueError: Wrong number of items passed 2, placement implies 3
Edit
Unfortunately, the workaround suggested by @joris does not work for my application. New counterexample:
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['A'] = ['C1', 'C1', 'C2', np.nan, 'C3', 'C3']
df['B'] = [1,2,3,4,5,6]
df['A'] = df.loc[:,'A'].astype('category')
df2 = df[0:4]
df2['A'] = df2['A'].cat.remove_unused_categories()
result = df2.groupby(by='A')['B'].nunique()
print(result)
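For reference, here is a minimal sketch of the result I would expect, computed on the plain string column without the categorical conversion. It assumes that the non-categorical groupby is a fair baseline for what the categorical version should return; the name `expected` is only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the counterexample, but column A is left as plain strings
# (no conversion to a categorical dtype).
df = pd.DataFrame({
    'A': ['C1', 'C1', 'C2', np.nan, 'C3', 'C3'],
    'B': [1, 2, 3, 4, 5, 6],
})

# First four rows only, so 'C3' never appears in the slice.
df2 = df[0:4]

# groupby drops rows whose key is NaN by default, so only C1 and C2 remain.
expected = df2.groupby(by='A')['B'].nunique()
print(expected)
```

With this baseline, `C1` has two distinct values of `B` and `C2` has one; the row with the missing key is ignored and `C3` does not appear at all.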