Replace the whole string if it contains a substring in pandas

Question

I want to replace every value that contains a specific substring. For example, if I have this DataFrame:

import pandas as pd
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'], 
                   'sport': ['tennis', 'football', 'basketball']})

I could replace football with the string "ball sport" as follows:

df.replace({'sport': {'football': 'ball sport'}})

However, I want to replace every value that contains "ball" (in this case, football and basketball) with "ball sport". Something like this:

df.replace({'sport': {'[strings that contain ball]': 'ball sport'}})

+4

python pandas

sk8r Sep 29 '16 at 11:05

4 answers

You can use apply with a lambda. The lambda's x parameter is each value in the sport column in turn:

df.sport = df.sport.apply(lambda x: 'ball sport' if 'ball' in x else x)
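The lambda above assumes every value in the column is a string; with a missing value, `'ball' in x` raises a TypeError. A minimal sketch that guards against this, assuming non-string entries should simply be left untouched (the helper name `label_ball` and the extra row are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Eve'],
                   'sport': ['tennis', 'football', 'basketball', None]})

def label_ball(value):
    # Only strings can contain the substring; missing values pass through unchanged.
    if isinstance(value, str) and 'ball' in value:
        return 'ball sport'
    return value

df['sport'] = df['sport'].apply(label_ball)
print(df)
```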

+3

DeepSpace Sep 29 '16 at 11:07

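You can use str.replace with a regular expression that matches the whole string (on pandas 2.0 and later, pass regex=True, because the pattern is otherwise treated as a literal string):

df.sport.str.replace(r'(^.*ball.*$)', 'ball sport', regex=True)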

0        tennis
1    ball sport
2    ball sport
Name: sport, dtype: object

Reassign the result to update the column:

df['sport'] = df.sport.str.replace(r'(^.*ball.*$)', 'ball sport', regex=True)
df
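If the substring is not a fixed word like "ball", it may contain characters that are special in a regular expression. A possible sketch, assuming the substring should be matched literally; `replace_if_contains` is a hypothetical helper, not a pandas function:

```python
import re
import pandas as pd

def replace_if_contains(series, substring, new_value):
    # Escape the substring so characters such as "+" or "." are matched literally,
    # then wrap it so the pattern covers the entire string.
    pattern = rf'^.*{re.escape(substring)}.*$'
    return series.str.replace(pattern, new_value, regex=True)

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})
df['sport'] = replace_if_contains(df['sport'], 'ball', 'ball sport')
print(df)
```

Because the pattern is anchored at both ends, each matching string is replaced exactly once as a whole.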

+3

piRSquared Sep 29 '16 at 11:10

Another option, using str.contains to select the rows and .loc to avoid chained assignment:

df.loc[df.sport.str.contains('ball'), 'sport'] = 'ball sport'

0

Axis Feb 09 '18 at 3:26

Edchum · Accepted Answer · 2016-09-29T11:06:46+0000

You can use str.contains to build a mask of the rows containing "ball", and then overwrite those rows with the new value:

In [71]:
df.loc[df['sport'].str.contains('ball'), 'sport'] = 'ball sport'
df

Out[71]:
    name       sport
0    Bob      tennis
1   Jane  ball sport
2  Alice  ball sport

To make the match case-insensitive, pass case=False:

df.loc[df['sport'].str.contains('ball', case=False), 'sport'] = 'ball sport'
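Two other parameters of str.contains can be useful here: na controls what the mask holds for missing values, and regex=False makes the pattern a plain substring instead of a regular expression. A small sketch, assuming the column may contain missing values (the extra row and the mixed-case "Football" are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Eve'],
                   'sport': ['tennis', 'Football', 'basketball', None]})

# na=False counts missing values as "no match", so the mask is purely boolean.
# regex=False matches the text 'ball' literally; case=False ignores letter case.
mask = df['sport'].str.contains('ball', case=False, na=False, regex=False)
df.loc[mask, 'sport'] = 'ball sport'
print(df)
```

Keeping the mask in its own variable also makes it easy to inspect which rows will be overwritten before assigning.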
