Replace the whole string if it contains a substring in pandas

I want to replace every value that contains a specific substring. For example, if I have this DataFrame:

import pandas as pd
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'], 
                   'sport': ['tennis', 'football', 'basketball']})

I could replace football with the string "ball sport" as follows:

df.replace({'sport': {'football': 'ball sport'}})

However, I want to replace everything that contains ball (in this case, football and basketball) with "ball sport". Something like this:

df.replace({'sport': {'[strings that contain ball]': 'ball sport'}})
4 answers

You can use str.contains to build a mask of the rows containing "ball" and then overwrite those rows with the new value:

In [71]:
df.loc[df['sport'].str.contains('ball'), 'sport'] = 'ball sport'
df

Out[71]:
    name       sport
0    Bob      tennis
1   Jane  ball sport
2  Alice  ball sport

To make the match case-insensitive, pass case=False:

df.loc[df['sport'].str.contains('ball', case=False), 'sport'] = 'ball sport'
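One thing to watch with the mask approach: if the column has missing values, str.contains returns a missing value for those rows, and that can make the mask unusable for indexing. A minimal sketch, assuming a hypothetical extra row (Tom) with no sport recorded; na=False counts missing values as "no match", and regex=False treats 'ball' as a literal substring rather than a pattern:

```python
import numpy as np
import pandas as pd

# 'Tom' and his missing sport are made up for illustration
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Tom'],
                   'sport': ['tennis', 'Football', 'basketball', np.nan]})

# na=False: missing values count as "no match"
# regex=False: 'ball' is matched as plain text, not as a regular expression
mask = df['sport'].str.contains('ball', case=False, na=False, regex=False)

# Overwrite only the matching rows; the missing value is left untouched
df.loc[mask, 'sport'] = 'ball sport'
print(df)
```

Keeping the mask in its own variable also makes it easy to inspect which rows will be changed before overwriting them.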

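You can use apply with a lambda that returns "ball sport" when "ball" is in the value and the original value x otherwise: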

df.sport = df.sport.apply(lambda x: 'ball sport' if 'ball' in x else x)
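The lambda above assumes every value is a string; 'ball' in x raises a TypeError on a missing value (a float NaN). A small sketch of a safer variant, assuming a hypothetical row with a missing sport and a helper name (to_ball_sport) chosen only for this example:

```python
import numpy as np
import pandas as pd

# 'Tom' and his missing sport are made up for illustration
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Tom'],
                   'sport': ['tennis', 'football', np.nan]})

def to_ball_sport(value):
    # Only strings are tested; NaN and other non-string values pass through unchanged
    if isinstance(value, str) and 'ball' in value:
        return 'ball sport'
    return value

df['sport'] = df['sport'].apply(to_ball_sport)
print(df)
```

apply calls the function once per value in Python, so on large columns the vectorized str.contains approach is usually the faster choice.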

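You can use str.replace with a regular expression that matches the whole value (pass regex=True, since recent pandas versions treat the pattern as a literal string by default):

df.sport.str.replace(r'(^.*ball.*$)', 'ball sport', regex=True)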

0        tennis
1    ball sport
2    ball sport
Name: sport, dtype: object

str.replace returns a new Series, so reassign the result to the column:

df['sport'] = df.sport.str.replace(r'(^.*ball.*$)', 'ball sport', regex=True)
df
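The same whole-value pattern should also work with DataFrame.replace in the nested-dict form used in the question, as long as regex=True is passed so the inner key is treated as a pattern. A short sketch under that assumption (the variable name result is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

# Outer key: the column to search. Inner key: a regex covering the whole value,
# so the entire string is swapped out, not just the 'ball' part.
result = df.replace({'sport': {r'^.*ball.*$': 'ball sport'}}, regex=True)
print(result)
```

Like str.replace, this returns a new DataFrame and leaves df unchanged unless the result is assigned back.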


Another way to use str.contains:

df.loc[df.sport.str.contains('ball'), 'sport'] = 'ball sport'

Source: https://habr.com/ru/post/1656210/

