Python count the number of substrings in a list from another list of strings without duplicates

Question

I have two lists:

main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

I want to count how many strings in main_list contain a string from master_list as a substring, without counting the same main_list element twice.

Example: for these two lists, the result of my function should be 4. "Smith" is found 3 times in main_list. "Roger" is found 2 times, but since 'Roger-Smith' has already been counted for "Smith", it is not counted again, so "Roger" adds only 1, giving 4 in total.

The function I wrote is below, but I think there is a faster way to do this:

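def string_detection(master_list, main_list):
    count = 0
    for substring in master_list:
        # Iterate over a copy, because main_list is modified inside the loop.
        temp = list(main_list)
        for string in temp:
            if substring in string:
                # Remove the match so a later substring cannot count it again.
                # Note: this mutates the caller's main_list.
                main_list.remove(string)
                count += 1
    return count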

+4

python string list

erwanlc Feb 16 '17 at 10:30

6

If you are using pandas, you can combine str.contains with sum():

import pandas as pd
main_list = pd.Series(['Smith', 'Smith', 'Roger', 'Roger-Smith', '42'])
master_list = ['Smith', 'Roger']
count = main_list.str.contains('|'.join(master_list)).sum()

+2

yuval Feb 16 '17 at 10:38

A one-liner: collect every string in main_list that contains any substring from master_list.

temp_list = [ string for string in main_list if any(substring in string for substring in master_list)]

temp_list is now:

['Smith', 'Smith', 'Roger', 'Roger-Smith']

The length of temp_list is then the count you are looking for.

+2

Yevhen Kuzmovych Feb 16 '17 at 10:42

main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

print(len([word for word in main_list if any(mw in word for mw in master_list)]))

+2

Elmex80s Feb 16 '17 at 10:42

Try this:

main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

i = 0
for elem in main_list:
    if elem in master_list:
        i += 1
        continue
    for master_elem in master_list:
        if master_elem in elem:
            i += 1
            break

print(i) # i = 4

The code above counts 'Roger-Smith' once. If you want it to count once per matching substring, delete the break.
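To make the difference concrete, here is a minimal sketch of the "count every match" behaviour. The function name count_all_matches is hypothetical, and it drops the exact-match shortcut so that every (element, substring) pair is counted:

```python
def count_all_matches(master_list, main_list):
    """Count every (element, substring) pair where the substring occurs."""
    total = 0
    for elem in main_list:
        for master_elem in master_list:
            if master_elem in elem:
                # No break here: one element can contribute several counts.
                total += 1
    return total


main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

print(count_all_matches(master_list, main_list))  # 5
```

Here 'Roger-Smith' contributes 2 (once for 'Smith', once for 'Roger'), so the total is 5 rather than the 4 asked for in the question.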

0

Olian04 Feb 16 '17 at 10:42

If your master_list is not expected to be huge, one way to do this is with regex:

import re

def string_detection(master_list, main_list):
    count = 0
    master = re.compile("|".join(master_list))
    for entry in main_list:
        if master.search(entry):
            count += 1
    return count
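One caveat with joining raw strings into a pattern: any regex metacharacters in master_list (such as '.' or '+') are interpreted as regex syntax. A minimal sketch, assuming the entries should be matched literally, escapes each one with re.escape first. The name string_detection_escaped is hypothetical, and the empty-list guard is an added assumption, since an empty pattern would match every string:

```python
import re


def string_detection_escaped(master_list, main_list):
    # Assumption: an empty master_list should match nothing.
    if not master_list:
        return 0
    # Escape each entry so characters like '.' or '+' match literally.
    pattern = re.compile("|".join(re.escape(s) for s in master_list))
    # Each main_list entry is counted at most once.
    return sum(1 for entry in main_list if pattern.search(entry))


main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

print(string_detection_escaped(master_list, main_list))  # 4
```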

0

zwer Feb 16 '17 at 10:48

Paul Rooney · Accepted Answer · 2017-02-16T10:44:41+0000

>>> sum(any(m in L for m in master_list) for L in main_list)
4

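Iterate over main_list and use any to check whether any string from master_list is a substring of the current element. any returns a bool, and since True counts as 1 and False as 0, sum adds up the True values, which gives the number of matching elements.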
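For comparison with the function in the question, the same expression can be wrapped in a function. This is a minimal sketch (the name count_matches is hypothetical). Unlike the question's version, it does not remove items from main_list, so it can be called repeatedly on the same data:

```python
def count_matches(master_list, main_list):
    # any(...) is True when at least one master string occurs in s;
    # sum() treats each True as 1, so each element counts at most once.
    return sum(any(m in s for m in master_list) for s in main_list)


main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

print(count_matches(master_list, main_list))  # 4
print(len(main_list))  # 5, the input list is left untouched
```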
