How to avoid matching the whole word "king" against "king?"

How do I check for an exact word in a string?

I need to handle cases where the word "king" is followed by a question mark, as in the examples below.

Unigram, must be False:

In [1]: answer = "king"
In [2]: context = "we run with the king? on sunday"

N-gram, must be False:

In [1]: answer = "king tut"
In [2]: context = "we run with the king tut? on sunday"

Unigram, must be True:

In [1]: answer = "king"
In [2]: context = "we run with the king on sunday"

N-gram, must be True:

In [1]: answer = "king tut"
In [2]: context = "we run with the king tut on sunday"

As shown above, for a unigram we can handle this by splitting the string into a list of words, but this does not work for n-grams.
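
To illustrate, here is a minimal sketch of the split-based check, using one of the contexts from the examples above:

```python
context = "we run with the king tut on sunday"

# A single word can be compared against the whitespace-separated tokens.
print("king" in context.split())      # True
# A two-word answer never equals a single token, so this is always False.
print("king tut" in context.split())  # False, although it should be True
```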

After reading some posts, I think I should try to handle this with a lookbehind, but I'm not sure.


Yes, lookarounds work here. You can build the pattern like this:

reg_answer = re.compile(r"(?<!\S)" + re.escape(answer) + r"(?!\S)")


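  • (?<!\S) - a negative lookbehind: the match must be preceded by the start of the string or by a whitespace character.
  • re.escape(answer) - the search phrase, with any regex metacharacters escaped so that it is matched literally.
  • (?!\S) - a negative lookahead: the match must be followed by the end of the string or by a whitespace character.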
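
As a minimal sketch, here is how this pattern behaves on the four cases from the question. The helper name contains_phrase is hypothetical and only wraps the line above:

```python
import re

def contains_phrase(answer, context):
    # Hypothetical helper: True if `answer` occurs in `context` with
    # whitespace (or a string boundary) on both sides.
    reg_answer = re.compile(r"(?<!\S)" + re.escape(answer) + r"(?!\S)")
    return reg_answer.search(context) is not None

print(contains_phrase("king", "we run with the king? on sunday"))          # False
print(contains_phrase("king tut", "we run with the king tut? on sunday"))  # False
print(contains_phrase("king", "we run with the king on sunday"))           # True
print(contains_phrase("king tut", "we run with the king tut on sunday"))   # True
```

Note that the space inside a multi-word answer is matched literally, so this assumes the words in the context are separated by exactly the same whitespace as in the answer.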
For a single word, simply return answer in context.split():

>>> answer in context.split()
False

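This works for a single word, but not for an n-gram.

For an n-gram, you could check each word of the answer: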

all([ans in context.split() for ans in answer.split()])

However, for "king tut" this also returns True when the words appear out of order, for example:

"we tut with the king"

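A quick check of that case, as a sketch. The helper name all_words_in is hypothetical and only wraps the all(...) expression above:

```python
def all_words_in(answer, context):
    # Hypothetical helper wrapping the all(...) check.
    words = context.split()
    return all(ans in words for ans in answer.split())

# Every word is present, so the check passes even though the
# phrase "king tut" never occurs in the context.
print(all_words_in("king tut", "we tut with the king"))                 # True
print(all_words_in("king tut", "we run with the king tut? on sunday"))  # False
```
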
So the words have to be compared in order, as a contiguous run of tokens (after .split()):

def ngram_in(match, string):
    """Return True if the words of `match` occur as a contiguous run of whole words in `string`."""
    targets = match.split()
    words = string.split()
    n = len(targets)
    if n == 0:
        # An empty answer cannot match anything.
        return False
    # Slide a window of n words over the context and compare whole tokens.
    for start in range(len(words) - n + 1):
        if words[start:start + n] == targets:
            return True
    return False

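This is O(n*m) in the worst case, where n is the number of words in the context and m is the number of words in the answer.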

>>> ngram_in("king", "was king tut a nice dude?")
True
>>> ngram_in("king", "was king? tut a nice dude?")
False
>>> ngram_in("king tut a", "was king tut a nice dude?")
True
>>> ngram_in("king tut a", "was king tut? a nice dude?")
False
>>> ngram_in("king tut a", "was king tut an nice dude?")
False
>>> ngram_in("king tut", "was king tut an nice dude?")
True


Why not check:

if answer in context: do stuff

Check this post for more details.


Source: https://habr.com/ru/post/1672873/

