Is there a greedy "or" group in regular expressions?

I have an automatically generated regular expression that basically represents one large "or" group:

(\bthe\b|\bcat\b|\bin\b|\bhat\.\b|\bhat\b) 

I noticed that in the case of

 hat. 

it will only match "hat", not "hat." as I want. Is there a way to make it more greedy?

UPDATE: I forgot about the word boundaries, sorry about that.

+6
1 answer

Put hat\. ahead of hat in the regex. The first alternative in an alternation that matches wins. hat already matches the start of "hat.", so hat\. is never tried.
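
The effect of the ordering can be checked with a short Python sketch. It assumes Python's re module and a simplified pattern without word boundaries; the sample sentence and the variable names are made up for illustration.

```python
import re

text = "the cat in the hat."

# "hat" is listed first: it matches, so the longer "hat\." is never tried.
short_first = re.compile(r"the|cat|in|hat|hat\.")
# "hat\." is listed first: the longer alternative gets the first attempt.
long_first = re.compile(r"the|cat|in|hat\.|hat")

short_matches = short_first.findall(text)
long_matches = long_first.findall(text)

print(short_matches)  # the last item is 'hat'
print(long_matches)   # the last item is 'hat.'
```

If the expression is generated automatically, sorting the terms by length (longest first) before joining them with | should give the same result.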

It is better to write this part as hat\.? rather than hat\.|hat. That makes the period optional, so you do not need two terms in the alternation.
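
A minimal sketch of the optional period, again assuming Python's re module and made-up sample strings. The ? quantifier is greedy, so the period is consumed whenever it is present.

```python
import re

optional_dot = re.compile(r"hat\.?")

# The period is included when it follows "hat" ...
with_dot = optional_dot.search("the hat. is here").group()
# ... and simply skipped when it does not.
without_dot = optional_dot.search("the hat is here").group()

print(with_dot)     # hat.
print(without_dot)  # hat
```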

After your edit:

There is no word boundary between . and, say, a space (both are non-word characters). So \bhat\.\b will only match things like hat.x, where another letter directly follows the period. This means that for "hat." followed by a space or by the end of the string, only the \bhat\b alternative matches, and the period is left out. I see that you have already found a solution.
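
The word-boundary behaviour can be reproduced with the sketch below, assuming Python's re module. The last pattern, \bhat\b\.?, is only one possible workaround shown for illustration; it is an assumption and not necessarily the solution the asker found.

```python
import re

# The two "hat" alternatives from the question, word boundaries included.
pattern = re.compile(r"\bhat\.\b|\bhat\b")

# Period followed by a space: no boundary after ".", so only "hat" matches.
followed_by_space = pattern.search("a hat. here").group()
# Period followed by a letter: the boundary exists, so "hat." matches.
followed_by_letter = pattern.search("a hat.x here").group()

# Assumed workaround: keep the boundaries around the word itself
# and make the period optional after them.
workaround = re.compile(r"\bhat\b\.?")
fixed = workaround.search("a hat. here").group()

print(followed_by_space)   # hat
print(followed_by_letter)  # hat.
print(fixed)               # hat.
```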

+9

Source: https://habr.com/ru/post/912720/

