Prefix matching in Python

I have a line like:

" This is such an nice artwork" 

and I have a tag_list ["art", "paint"].

Basically, I want to write a function that takes this line and the tag list as inputs and returns the word "artwork", because that word starts with "art", which is in the tag list.

What is the most efficient way to do this? Speed is what matters most to me.

    def prefix_match(string, taglist):
        # do something here
        return word_in_string
+6
3 answers

Try the following:

    def prefix_match(sentence, taglist):
        taglist = tuple(taglist)
        for word in sentence.split():
            if word.startswith(taglist):
                return word

This works because str.startswith() can take a tuple of prefixes as an argument.

Note that I renamed string to sentence, so there is no ambiguity with the string module.
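
To see the tuple form of str.startswith() on its own, here is a minimal sketch. The sample words are assumptions chosen for illustration; they are not part of the question. It also shows why the tuple() conversion is needed: a plain list is rejected.

```python
# Assumed sample data, for illustration only.
prefixes = ("art", "paint")
words = ["artwork", "painting", "smart"]

# startswith() returns True if the word begins with ANY prefix in the tuple.
# "smart" contains "art" but does not start with it, so it does not match.
results = {word: word.startswith(prefixes) for word in words}
print(results)

# A list is rejected: the argument must be a str or a tuple of str.
try:
    "artwork".startswith(["art", "paint"])
    list_accepted = True
except TypeError:
    list_accepted = False
print(list_accepted)
```

Keep in mind that prefix_match as written above returns None implicitly when no word matches, so the caller may want to check for that.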

+7

Try the following:

    def prefix_match(s, taglist):
        words = s.split()
        return [w for t in taglist for w in words if w.startswith(t)]

    s = "This is such an nice artwork"
    taglist = ["art", "paint"]
    prefix_match(s, taglist)

This returns a list of all the words in the string that start with any prefix in the tag list.
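
The nested comprehension calls startswith() once per (tag, word) pair, orders the results by tag rather than by position in the sentence, and lists a word once for each tag it starts with. A possible single-pass variant, assuming sentence order is preferred, combines it with the tuple form of str.startswith(). The name prefix_matches and the sample sentence are hypothetical.

```python
def prefix_matches(sentence, taglist):
    """Return every word that starts with any tag, in sentence order."""
    prefixes = tuple(taglist)  # startswith() needs a tuple, not a list
    # One startswith() call per word checks all prefixes at once.
    return [word for word in sentence.split() if word.startswith(prefixes)]


# Assumed sample sentence, for illustration only.
s = "paintbrush and artwork"
print(prefix_matches(s, ["art", "paint"]))
```

Each word appears at most once per occurrence in the sentence, even if it starts with several tags.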

+2

Here is a possible solution. I use a regex because it makes it easy to strip punctuation. I also use collections.Counter, which can improve efficiency if your line contains many duplicate words, since each distinct word is checked only once.

    from collections import Counter
    import re

    tag_list = ["art", "paint"]
    s = "This is such an nice artwork, very nice artwork. This is the best painting I've ever seen"

    # Extract the words without punctuation, then count each distinct word
    words = re.findall(r'\w+', s)
    dicto = Counter(words)

    def found(s, tag):
        return s.startswith(tag)

    words_found = []
    for tag in tag_list:
        for k, v in dicto.items():  # on Python 2, iteritems() also works
            if found(k, tag):
                words_found.append((k, v))

The last part can be done with a list comprehension:

    words_found = [(k, v) for tag in tag_list for k, v in dicto.items() if found(k, tag)]

Result:

    >>> words_found
    [('artwork', 2), ('painting', 1)]
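
Since a regex is already used to split the words, one alternative is to let the regex do the prefix test as well. This is a different technique from the answer's code, sketched under two assumptions: the tag list is non-empty, and every tag starts with a word character so that \b can anchor it. The name count_prefixed_words is hypothetical.

```python
import re
from collections import Counter


def count_prefixed_words(text, tags):
    """Count the words in text that start with any of the given tags."""
    # re.escape() keeps any special characters in a tag literal.
    # \b anchors the tag at the start of a word; \w* takes the rest of it.
    # Assumes tags is non-empty and each tag starts with a word character.
    alternation = "|".join(re.escape(tag) for tag in tags)
    pattern = re.compile(r"\b(?:" + alternation + r")\w*")
    return Counter(pattern.findall(text))


s = ("This is such an nice artwork, very nice artwork. "
     "This is the best painting I've ever seen")
print(count_prefixed_words(s, ["art", "paint"]))
```

Whether this is faster than the loop over the Counter depends on the input, so it is worth timing both on real data.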
+1

Source: https://habr.com/ru/post/916501/

