Removing rows from a pandas DataFrame where the row contains a string present in a list?

Question

I know how to delete rows from a pandas DataFrame based on a single column ('From') where the value contains a given string. For example, given df and someString:

df = df[~df.From.str.contains(someString)]

Now I want to do something similar, but this time I want to delete any row whose value contains a string that is an element of another list. If I were not using pandas, I would use the for and if ... not ... in approach. But how can I use pandas' own functionality to achieve this? Given a list of items to remove, ignorethese, extracted from the file of comma-separated strings EMAILS_TO_IGNORE, I tried:

with open(EMAILS_TO_IGNORE) as emails:
    ignorethese = emails.read().split(', ')
    df = df[~df.From.isin(ignorethese)]

Am I complicating things by first reading the file into a list? Given that this is a comma-delimited text file, can I bypass this with something simpler?

python python-2.7 pandas

Pyderman Sep 18 '15 at 5:57

1 answer

Anand s kumar · Accepted Answer · 2015-09-18T06:03:36+0000

Series.str.contains supports regular expressions, so you can build a regular expression from your list of emails to ignore, using | to OR them together, and then pass it to contains. Example -

df[~df.From.str.contains('|'.join(ignorethese))]

Demo -

In [109]: df
Out[109]:
                                         From
0         Grey Caulfu <grey.caulfu@ymail.com>
1  Deren Torculas <deren.e.torcs87@gmail.com>
2    Charlto Youna <youna.charlto4@yahoo.com>

In [110]: ignorelist = ['grey.caulfu@ymail.com','deren.e.torcs87@gmail.com']

In [111]: ignorere = '|'.join(ignorelist)

In [112]: df[~df.From.str.contains(ignorere)]
Out[112]:
                                       From
2  Charlto Youna <youna.charlto4@yahoo.com>
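As a rough, self-contained version of the demo above, the following sketch rebuilds the same three-row frame and contrasts the isin attempt from the question with the str.contains approach. The variable names exact and filtered are only illustrative. isin compares each whole cell against the list, so it cannot match an address embedded in a longer "Name <address>" string, whereas str.contains looks for the pattern anywhere in the cell.

```python
import pandas as pd

# Same sample data as in the demo above.
df = pd.DataFrame({
    "From": [
        "Grey Caulfu <grey.caulfu@ymail.com>",
        "Deren Torculas <deren.e.torcs87@gmail.com>",
        "Charlto Youna <youna.charlto4@yahoo.com>",
    ]
})
ignorelist = ["grey.caulfu@ymail.com", "deren.e.torcs87@gmail.com"]

# isin() tests whole-cell equality, so no row is dropped here:
# none of the cells is exactly equal to a bare address.
exact = df[~df["From"].isin(ignorelist)]

# str.contains() searches inside each cell, so rows 0 and 1 are dropped.
pattern = "|".join(ignorelist)
filtered = df[~df["From"].str.contains(pattern)]

print(len(exact))     # number of rows left after the isin attempt
print(filtered)       # rows left after the str.contains filter
```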

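Please note that, as stated in the documentation, Series.str.contains uses re.search(), so each string in the list is interpreted as a regular expression and may match anywhere in the value.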
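Because the pattern goes through re.search(), characters such as . in an email address act as regex metacharacters. One possible way to guard against that, sketched below under some assumptions, is to pass each entry through re.escape() before joining. The raw string stands in for the contents of the EMAILS_TO_IGNORE file, and the second row of the frame is a made-up address added only to show the difference; neither comes from the original post. The sketch also strips whitespace and drops empty entries, since an empty string in the joined pattern would match every row.

```python
import re
import pandas as pd

# Hypothetical stand-in for the text read from the EMAILS_TO_IGNORE file.
raw = "grey.caulfu@ymail.com, \n"

# Split on commas, strip surrounding whitespace, and drop empty entries.
ignorelist = [item.strip() for item in raw.split(",") if item.strip()]

df = pd.DataFrame({
    "From": [
        "Grey Caulfu <grey.caulfu@ymail.com>",
        "Grey Caulfu <greyXcaulfu@ymail.com>",  # made-up near-miss address
        "Charlto Youna <youna.charlto4@yahoo.com>",
    ]
})

# Unescaped: "." matches any character, so the near-miss row is dropped too.
unescaped = "|".join(ignorelist)
loose = df[~df["From"].str.contains(unescaped)]

# Escaped: "." only matches a literal dot, so only row 0 is dropped.
escaped = "|".join(re.escape(item) for item in ignorelist)
strict = df[~df["From"].str.contains(escaped)]

print(loose.index.tolist())
print(strict.index.tolist())
```

If the column may contain missing values, str.contains also accepts an na argument to decide how those rows are treated.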
