I am trying to turn a column of strings into integer identifiers ... and I cannot find an elegant way to do this in pandas (or Python). In the following example, I convert "A", which is a column of strings, to numbers by building a mapping from its unique values, but it looks like a dirty hack to me.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['homer_simpson', 'mean_street', 'homer_simpson', 'bla_bla'], 'B': 4})
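# Collect the distinct values of 'A' in order of first appearance
unique = df['A'].unique()
# Give each distinct value an integer id: 0, 1, 2, ...
mapping = dict(zip(unique, np.arange(len(unique))))
# Replace the values in column 'A' with their ids
new_df = df.replace({'A': mapping})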
Is there a better, more direct way to achieve this?
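One candidate I have come across is `pd.factorize`, which seems to do the unique-and-map steps in a single call. Below is a minimal sketch on the same example data; the use of `DataFrame.assign` to write the codes back is my own choice here, not something I know to be the recommended idiom.

```python
import pandas as pd

df = pd.DataFrame({'A': ['homer_simpson', 'mean_street', 'homer_simpson', 'bla_bla'], 'B': 4})

# pd.factorize returns a pair: an array of integer codes and the unique values.
# Codes are assigned in order of first appearance, like the manual mapping above.
codes, uniques = pd.factorize(df['A'])

# Build a new frame with column 'A' replaced by its integer codes.
new_df = df.assign(A=codes)
```

Here `uniques` keeps the original labels, so `uniques[codes]` should recover the strings if they are needed later.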