Python: Replace whole string values in a DataFrame, not substrings

I am trying to replace strings in a dataframe only when the whole string is equal to another string. I do not want to replace substrings.

So:

If I have df:

 Index  Name       Age
   0     Joe        8
   1     Mary       10
   2     Marybeth   11

and I want to replace "Mary" with "Amy" only when the whole string is "Mary", so I get

 Index  Name       Age
   0     Joe        8
   1     Amy        10
   2     Marybeth   11

I do the following:

df['Name'] = df['Name'].apply(lambda x: x.replace('Mary','Amy'))

My understanding from searching is that replace defaults to regex=False, so replace should look for the whole value "Mary" in the dataframe. Instead, I get this result:

 Index  Name       Age
   0     Joe        8
   1     Amy        10
   2     Amybeth   11

What am I doing wrong?

+4
3 answers

Explanation:

When you use apply like this, the lambda receives individual Python strings, not a Pandas Series, so x.replace is the built-in str.replace:

In [42]: df['Name'].apply(lambda x: print(type(x)))
<class 'str'>  # <---- NOTE
<class 'str'>  # <---- NOTE
<class 'str'>  # <---- NOTE
Out[42]:
0    None
1    None
2    None
Name: Name, dtype: object

This is the same as:

In [44]: 'Marybeth'.replace('Mary','Amy')
Out[44]: 'Amybeth'
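If you want to keep apply, a minimal sketch is to compare the whole string inside the lambda instead of calling str.replace. The dataframe below is rebuilt from the question; the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the dataframe from the question
df = pd.DataFrame({"Name": ["Joe", "Mary", "Marybeth"], "Age": [8, 10, 11]})

# str.replace substitutes every occurrence of the substring "Mary"
substring_result = df["Name"].apply(lambda x: x.replace("Mary", "Amy"))

# An equality check only matches when the whole string is "Mary"
whole_result = df["Name"].apply(lambda x: "Amy" if x == "Mary" else x)

print(substring_result.tolist())
print(whole_result.tolist())
```

This works, but it runs a Python function per element, so the vectorized methods are usually the better choice.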

Solution:

Use the Series.replace(to_replace = None, value = None, inplace = False, limit = None, regex = False, method = 'pad', axis = None) method directly (without Series.apply()). By default (regex=False) it replaces whole values, not substrings:

In [39]: df.Name.replace('Mary','Amy')
Out[39]:
0         Joe
1         Amy
2    Marybeth
Name: Name, dtype: object
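For completeness, here is a self-contained sketch of the same call applied to the question's data and assigned back to the column. The dataframe construction is an assumption based on the table in the question.

```python
import pandas as pd

# Rebuild the dataframe from the question
df = pd.DataFrame({"Name": ["Joe", "Mary", "Marybeth"], "Age": [8, 10, 11]})

# Series.replace with the default regex=False matches whole values only,
# so "Marybeth" is left untouched
df["Name"] = df["Name"].replace("Mary", "Amy")

print(df)
```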

With regex=True, it replaces substrings:

In [40]: df.Name.replace('Mary','Amy', regex=True)
Out[40]:
0        Joe
1        Amy
2    Amybeth
Name: Name, dtype: object

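Note: Series.str.replace(pat, repl, n = -1, case = None, flags = 0) doesn't have a regex parameter here; it always treats pat and repl as RegEx, so it replaces substrings. (Newer Pandas versions add a regex parameter, but the method still replaces substrings either way.)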

In [41]: df.Name.str.replace('Mary','Amy')
Out[41]:
0        Joe
1        Amy
2    Amybeth
Name: Name, dtype: object
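If you do need a regular expression, one way to restrict it to whole values is to anchor the pattern with ^ and $. This is a sketch that assumes a Pandas version in which Series.str.replace accepts the regex keyword; the variable names are illustrative.

```python
import pandas as pd

names = pd.Series(["Joe", "Mary", "Marybeth"], name="Name")

# ^ and $ anchor the pattern to the start and end of the string,
# so only values that are exactly "Mary" match
via_str = names.str.replace(r"^Mary$", "Amy", regex=True)
via_replace = names.replace(r"^Mary$", "Amy", regex=True)

print(via_str.tolist())
print(via_replace.tolist())
```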
+1

Use replace with a dict (this is the DataFrame/Series replace method, not Series.str.replace):

df['Name'].replace({'Mary':'Amy'})
Out[582]: 
0         Joe
1         Amy
2    Marybeth
Name: Name, dtype: object
df['Name'].replace({'Mary':'Amy'},regex=True)
Out[583]: 
0        Joe
1        Amy
2    Amybeth
Name: Name, dtype: object
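The dict form also works on a whole DataFrame. With a nested dict, the outer key names the column to search, so other columns are not affected. In this sketch the Nickname column is hypothetical and is only there to show that it is left alone.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Joe", "Mary", "Marybeth"],
    "Nickname": ["Mary", "M", "Beth"],  # hypothetical column, not in the question
    "Age": [8, 10, 11],
})

# Outer key: column to search; inner dict: whole value -> replacement
result = df.replace({"Name": {"Mary": "Amy"}})

print(result)
```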

Documentation for both methods:

Series: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.replace.html

DataFrame: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.replace.html

+4

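You can also use loc with a boolean mask, which selects only the rows where Name is exactly "Mary" and assigns the new value there: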

df.loc[df['Name'] == 'Mary', 'Name'] = "Amy"
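Spelled out step by step, the loc approach might look like the sketch below. The dataframe is rebuilt from the question and is_mary is just an illustrative name for the mask.

```python
import pandas as pd

# Rebuild the dataframe from the question
df = pd.DataFrame({"Name": ["Joe", "Mary", "Marybeth"], "Age": [8, 10, 11]})

# Element-wise equality is True only where the whole value is "Mary"
is_mary = df["Name"] == "Mary"
print(is_mary.tolist())

# Assign the new value only to the selected rows of the Name column
df.loc[is_mary, "Name"] = "Amy"
print(df)
```

Unlike replace, this modifies df in place, which can be convenient when the condition depends on other columns as well.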
+2

Source: https://habr.com/ru/post/1692057/

