Check if the string contains only the given characters

Question


What is the easiest way to check if a string contains only certain characters in Python? (Without using RegEx or anything else, of course)

In particular, I have a list of strings, and I want to filter out all of them except the words that consist ONLY of letters (ANY of them) from another string. For example, filtering ['aba', 'acba', 'caz'] through 'abc' should give ['aba', 'acba'] ('z' is not in 'abc').

In other words, keep only the items that can be made using the given letters.

+6

python string

Jollywatt Sep 09 '13 at 9:12


7 answers

You can use sets:

    >>> l = ['aba', 'acba', 'caz']
    >>> s = set('abc')
    >>> [item for item in l if not set(item).difference(s)]
    ['aba', 'acba']
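The same subset test can also be spelled with `set.issuperset`, which accepts any iterable and so avoids building a set for every word. This is only a minimal sketch; the helper name `only_chars` is hypothetical and not part of the answer above.

```python
def only_chars(words, allowed):
    """Keep the words whose characters all appear in `allowed`."""
    allowed_set = set(allowed)
    # issuperset accepts any iterable, so each word can be passed directly.
    return [w for w in words if allowed_set.issuperset(w)]


print(only_chars(['aba', 'acba', 'caz'], 'abc'))  # ['aba', 'acba']
```

Note that an empty string passes this test, just as it does with the `difference` version.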

+8

alecxe Sep 09 '13 at 9:19


Assuming that you only want the strings in your list that consist solely of characters from your search string, you can easily do

    >>> hay = ['aba', 'acba', 'caz']
    >>> needle = set('abc')
    >>> [h for h in hay if not set(h) - needle]
    ['aba', 'acba']

If you would rather avoid sets, you can do the same using str.translate. In this case, you delete all characters that are in the search string; a word passes if nothing is left. (The two-argument form of translate shown here is the Python 2 signature.)

    >>> needle = 'abc'
    >>> [h for h in hay if not h.translate(None, needle)]
    ['aba', 'acba']
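On Python 3, `str.translate` takes a single table argument, so the call above needs a table built with `str.maketrans`. A minimal sketch of the equivalent, assuming Python 3; the variable name `delete_needle` is made up for illustration:

```python
hay = ['aba', 'acba', 'caz']
needle = 'abc'

# The third argument of str.maketrans lists characters to delete.
delete_needle = str.maketrans('', '', needle)

# A word passes if nothing is left after deleting the needle characters.
result = [h for h in hay if not h.translate(delete_needle)]
print(result)  # ['aba', 'acba']
```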

+6

Abhijit Sep 09 '13 at 9:19


Something like this:

    strings = ['aba', 'acba', 'caz']
    given = "abc"
    # list() is needed on Python 3, where filter returns an iterator
    list(filter(lambda string: all(char in given for char in string), strings))

+4

Bleeding fingers Sep 09 '13 at 9:19


The question is somewhat ambiguous about reusing letters from the base string: it is not clear whether letters may be repeated, or whether letters may be left out. This solution addresses that with a function that takes a reuse parameter:

    from collections import Counter

    def anagram_filter(data, base, reuse=True):
        if reuse:
            # all characters in objects in data are in base, count ignored
            base = set(base)
            return [d for d in data if not set(d).difference(base)]
        r = []
        cb = Counter(base)
        for d in data:
            # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
            for k, v in Counter(d).items():
                if (k not in cb) or (v > cb[k]):
                    break
            else:
                # no break: every letter of d is available often enough in base
                r.append(d)
        return r

Usage:

    >>> anagram_filter(['aba', 'acba', 'caz'], 'abc')
    ['aba', 'acba']
    >>> anagram_filter(['aba', 'acba', 'caz'], 'abc', False)
    []
    >>> anagram_filter(['aba', 'cba', 'caz'], 'abc', False)
    ['cba']
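The no-reuse case can also be expressed with `Counter` subtraction, which discards counts that are zero or negative. The following is a sketch of that alternative, not the answer's own code; the helper name `fits_without_reuse` is hypothetical.

```python
from collections import Counter


def fits_without_reuse(word, base):
    # Counter subtraction keeps only positive counts, so the result is
    # empty exactly when base has enough of every letter in word.
    return not (Counter(word) - Counter(base))


words = ['aba', 'cba', 'caz']
print([w for w in words if fits_without_reuse(w, 'abc')])  # ['cba']
```

Here 'aba' fails because it needs two 'a's and the base string offers only one.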

+1

Inbar rose Sep 09 '13 at 9:37


Below is the code:

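    a = ['aba', 'acba', 'caz']
    needle = 'abc'

    def onlyNeedle(word):
        # reject the word as soon as one letter is not in needle
        for letter in word:
            if letter not in needle:
                return False
        return True

    a = list(filter(onlyNeedle, a))  # list() so the result also prints on Python 3
    print(a)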

0

Snowwolf Sep 09 '13 at 9:23


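I will assume your reluctance toward regex is not really an issue:

    import re

    strings = ['aba', 'acba', 'caz']
    given = "abc"
    # "*" lets the character class cover the whole string, not just one character
    list(filter(lambda value: re.match("^[" + given + "]*$", value), strings))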
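If `given` may contain characters that are special inside a character class (such as `]`, `^` or `-`), one option is to escape it before building the pattern. A minimal sketch, assuming Python 3.7+ and a non-empty `given`; the helper name `only_given` is hypothetical:

```python
import re


def only_given(strings, given):
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    pattern = re.compile("[" + re.escape(given) + "]*")
    # fullmatch succeeds only if the entire string matches the pattern.
    return [s for s in strings if pattern.fullmatch(s)]


print(only_given(['aba', 'acba', 'caz'], 'abc'))  # ['aba', 'acba']
```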

0

njzk2 Sep 09 '13 at 9:29


Andrew Gorcester · Accepted Answer · 2013-09-09T09:22:51+0000

Assuming the discrepancy in your example is a typo, then this should work:

    my_list = ['aba', 'acba', 'caz']
    result = [s for s in my_list if not s.strip('abc')]

This yields ['aba', 'acba']. str.strip(chars) returns an empty string if the string being stripped contains only characters from chars. The order of the characters does not matter.
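To see why this works even though strip only removes characters from the two ends, it may help to look at what is left over in each case. A small illustration (the variable name `allowed` is only for this example):

```python
allowed = 'abc'

# strip removes allowed characters from both ends and stops, on each side,
# at the first character that is not in `allowed`.
print(repr('aba'.strip(allowed)))  # ''
print(repr('caz'.strip(allowed)))  # 'z'
print(repr('azb'.strip(allowed)))  # 'z'
```

A disallowed character always survives, wherever it sits in the string, so an empty result means every character was allowed.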
