How to replace values in several categorical columns of a pandas DataFrame

I want to replace certain values in a data frame that contains several categorical columns.

import pandas as pd
df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')

If I apply .replace to a single column, the result is as expected:

>>> df.s1.replace('a', 1)
0    1
1    b
2    c
Name: s1, dtype: object

If I apply the same operation to the entire data frame, it raises an error (abbreviated):

>>> df.replace('a', 1)
ValueError: Cannot setitem on a Categorical with a new category, set the categories first

During handling of the above exception, another exception occurred:
ValueError: Wrong number of dimensions

If the data frame contains integers as categories, the following works:

df = pd.DataFrame({'s1': [1, 2, 3], 's2': [1, 3, 4]}, dtype='category')

>>> df.replace(1, 3)
    s1  s2
0   3   3
1   2   3
2   3   4

But this does not:

>>> df.replace(1, 2)
ValueError: Wrong number of dimensions

What am I missing?

2 answers

Without digging deeper, this looks like a bug to me.

My workaround: use pd.DataFrame.apply together with pd.Series.replace.
This has the advantage that you don't have to change any dtypes.

df = pd.DataFrame({'s1': [1, 2, 3], 's2': [1, 3, 4]}, dtype='category')
df.apply(pd.Series.replace, to_replace=1, value=2)

  s1  s2
0  2   2
1  2   3
2  3   4

or

df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')
df.apply(pd.Series.replace, to_replace='a', value=1)

  s1 s2
0  1  1
1  b  c
2  c  d
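
If the goal is to keep the categorical dtype, one alternative not mentioned in this thread is to rename the category itself with Series.cat.rename_categories. The following is a minimal sketch under that assumption; the name `renamed` is only illustrative. Note that this is not a drop-in substitute for replace: renaming a category to a value that already exists among the categories raises an error, because categories must be unique.

```python
import pandas as pd

df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')

# Rename the category 'a' to 1 in each column; the dtype stays categorical.
# Mapping keys that are not among a column's categories are ignored.
renamed = pd.DataFrame(
    {col: df[col].cat.rename_categories({'a': 1}) for col in df.columns}
)

print(renamed)
print(renamed.dtypes)
```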

@cᴏʟᴅsᴘᴇᴇᴅ's workaround:

df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')
df.applymap(str).replace('a', 1)

  s1 s2
0  1  1
1  b  c
2  c  d
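
A variation on the same idea, sketched here as an assumption rather than as part of the original answer: cast to object instead of str (so that non-string values keep their type), replace, and convert back to category if the categorical dtype is still wanted. The names `replaced` and `replaced_cat` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')

# Drop the categorical dtype first, so replace() is free to introduce new values.
replaced = df.astype(object).replace('a', 1)

# Optionally convert back; each column infers categories from its own values.
replaced_cat = replaced.astype('category')

print(replaced_cat)
print(replaced_cat.dtypes)
```

The round trip discards the original category lists, so categories that no longer occur in a column are lost.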

Each column has its own set of categories:

In [224]: df.s1.cat.categories
Out[224]: Index(['a', 'b', 'c'], dtype='object')

In [225]: df.s2.cat.categories
Out[225]: Index(['a', 'c', 'd'], dtype='object')

Replacing with a value that is already a category in both columns works:

In [226]: df.replace('d','a')
Out[226]:
  s1 s2
0  a  a
1  b  c
2  c  a

One option is to build the categorical columns explicitly with a common list of categories:

pd.Categorical(..., categories=[...])

...
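
A minimal sketch of giving every column the same category list. It uses pd.CategoricalDtype with DataFrame.astype rather than calling pd.Categorical per column, and it assumes the replacement value (1) is known in advance so it can be included in the shared categories. The names `all_values`, `shared` and `df_shared` are only illustrative. Once the new value is a valid category, plain assignment no longer triggers the "Cannot setitem on a Categorical with a new category" error.

```python
import pandas as pd

df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')

# Union of the categories of both columns, plus the replacement value 1.
all_values = sorted(set(df['s1'].cat.categories) | set(df['s2'].cat.categories))
shared = pd.CategoricalDtype(categories=all_values + [1])

# Cast every column to the shared categorical dtype.
df_shared = df.astype(shared)

# Assigning 1 is allowed because 1 is a declared category in every column.
for col in df_shared.columns:
    df_shared.loc[df_shared[col] == 'a', col] = 1

print(df_shared)
print(df_shared['s1'].cat.categories)
```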


Source: https://habr.com/ru/post/1693691/

