Regular expression for a third-person verb

Question

I am trying to create a regular expression that matches the third-person form of a verb created using the following rule:

If the verb ends in e that is not preceded by i, o, s, x, z, ch, or sh, add s.

So I am looking for a regular expression that matches a word consisting of some letters, followed by something other than i, o, s, x, z, ch, or sh, and then "es". I tried this:

\b\w*[^iosxz(sh)(ch)]es\b

According to regex101, it matches "loves", "hates", etc. However, it does not match "bathes". Why not?

+6

python regex

maestromusica Nov 13 '16 at 9:50


2 answers

If you want to match strings ending in e that is not preceded by i, o, s, x, z, ch, or sh, you should use a negative lookbehind:

 (?<!i|o|s|x|z|ch|sh)e

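Your regular expression contains [^iosxz(sh)(ch)], which is a character class: ^ just negates it, and everything else inside is matched literally, one character at a time. Parentheses do not group inside a class, so it is equivalent to:

 [^io)sxz(ch]

which actually means: "match any single character that is not one of i, o, ), s, x, z, (, c, h". Because h is excluded, the h before "es" in "bathes" prevents a match.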
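To check that equivalence, here is a minimal sketch that compiles the pattern from the question next to one whose class lists each character only once. The word list and the variable names are illustrative assumptions, not part of the original answer.

```python
import re

# The pattern from the question, and the same class written out as single characters.
original = re.compile(r'\b\w*[^iosxz(sh)(ch)]es\b')
explicit = re.compile(r'\b\w*[^iosxz()ch]es\b')

# Illustrative sample words (an assumption for this sketch).
words = ['loves', 'hates', 'bathes', 'matches', 'goes']

# For each word, record whether each pattern matches the whole word.
results = {
    w: (bool(original.fullmatch(w)), bool(explicit.fullmatch(w)))
    for w in words
}

for w, (a, b) in results.items():
    print(f'{w}: original={a}, explicit={b}')
```

Both patterns agree on every word, and both reject "bathes", because any h directly before "es" is excluded, whether or not a c or s precedes it.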

+1

Maroun Nov 13 '16 at 9:56


Wiktor Stribiżew · Accepted Answer · 2016-11-13T10:12:52+0000

You can use

 \b(?=\w*(?<![iosxz])(?<![cs]h)es\b)\w*

See the regex demo.

Since Python re does not support variable-length alternatives in lookbehind, you need to break down the conditions into two lookbehinds here.
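The restriction can be observed directly. The sketch below is only an illustration with made-up variable names: it tries to compile a single lookbehind whose alternatives differ in length, then compiles the two-lookbehind form and tries it on two sample words.

```python
import re

# A single lookbehind mixing one- and two-character alternatives:
# Python's re module rejects this at compile time.
try:
    re.compile(r'(?<!i|o|s|x|z|ch|sh)es\b')
    single_lookbehind_ok = True
except re.error as exc:
    single_lookbehind_ok = False
    print('re.error:', exc)

# Two fixed-width lookbehinds, one per length, compile without error.
split = re.compile(r'(?<![iosxz])(?<![cs]h)es\b')

bathes_found = bool(split.search('bathes'))    # 'th' before 'es' is allowed
matches_found = bool(split.search('matches'))  # 'ch' before 'es' is rejected
print(bathes_found, matches_found)
```

Each lookbehind here has a single fixed width (one character and two characters respectively), which is what re requires.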

Pattern details:

\b - a leading word boundary
(?=\w*(?<![iosxz])(?<![cs]h)es\b) - a positive lookahead that requires the following sequence:
- \w* - zero or more word characters
- (?<![iosxz]) - no i, o, s, x, or z immediately before the current position, and ...
- (?<![cs]h) - no ch or sh immediately before the current position ...
- es - followed by es ...
- \b - at the end of the word
\w* - zero or more word characters (+ may be better here, to require one or more).
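The same breakdown can be written into the pattern itself with re.VERBOSE, which ignores whitespace and allows comments. This is a sketch of the pattern above, not a different technique; the name THIRD_PERSON_ES and the sample sentence are assumptions made for illustration.

```python
import re

# Hypothetical name; the pattern is the one from this answer, spread over lines.
THIRD_PERSON_ES = re.compile(r"""
    \b                  # leading word boundary
    (?=                 # positive lookahead: the word must end as follows
        \w*             #   zero or more word characters
        (?<![iosxz])    #   not preceded by i, o, s, x, or z
        (?<![cs]h)      #   not preceded by ch or sh
        es              #   literal "es"
        \b              #   at the end of the word
    )
    \w*                 # consume the word itself
""", re.VERBOSE)

# Illustrative input (an assumption for this sketch).
found = THIRD_PERSON_ES.findall('she likes it, he matches it, she bathes, he does')
print(found)
```

Only "likes" and "bathes" are returned here: "matches" fails the ch check and "does" fails the o check.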

See the Python demo:

 import re

 r = re.compile(r'\b(?=\w*(?<![iosxz])(?<![cs]h)es\b)\w*')
 s = 'it matches "likes", "hates" etc. However, it does not match "bathes", why doesn\'t it?'
 print(re.findall(r, s))
