AS3 RegExp for matching words with boundary-type characters in them

I want to match a list of words, which is easy enough when those words are truly words. For example, /\b (pop|push) \b/gsx when run against the string

pop gave the door a push but it popped back

will match the words pop and push but not popped.

I need similar functionality for words containing characters that would normally qualify as word boundaries. So I need /\b (reverse!|push) \b/gsx, when run against the string

push reverse! reverse!push

to match only reverse! and push, but not reverse!push. Obviously this regex is not going to do that, so what should I use instead of \b to make my regex smart enough to handle these funky requirements?

+1
3 answers

At the end of a word, \b means that the previous character was a word character and the next character (if there is one) is not a word character. You want to drop the first condition, because there may be a non-word character at the end of your "word". That leaves you with a negative lookahead:

 /\b (reverse!|push) (?!\w)/gx 

I am sure that AS3 regular expressions support lookahead.
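
As a rough check of the lookahead idea, here is a minimal sketch in TypeScript rather than AS3, assuming both engines treat \b and (?!\w) the same way. ECMAScript has no x flag, so the spaces are removed from the pattern; the test string and all variable names are made up for illustration.

```typescript
// The answer's pattern with the x-mode whitespace removed.
const lookaheadPattern: RegExp = /\b(reverse!|push)(?!\w)/g;

// Hypothetical test string built from the question's terms.
const sample: string = "push reverse! reverse!push";

// Collect [start index, matched text] for every match.
const hits: Array<[number, string]> = [];
let m: RegExpExecArray | null;
while ((m = lookaheadPattern.exec(sample)) !== null) {
  hits.push([m.index, m[0]]);
}

// Matches: "push" at 0, "reverse!" at 5, and "push" at 22.
console.log(hits);
```

The standalone reverse! is found and the reverse! at the start of reverse!push is rejected, which is the point of the lookahead. The push at index 22 still matches, because the leading \b is satisfied between "!" and "p"; if that match is unwanted, the left-hand side needs a similar adjustment.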

+2

Your first problem is that you need three (possibly four) cases in your alternation, not two.

  • /\breverse!(?:\s|$)/ reverse! by itself
  • /\bpush\b/ push by itself
  • /\breverse!push\b/ the two together
  • /\bpushreverse!(?:\s|$)/ this is a possible fourth case

The second problem is that \b will not match after a "!" because "!" is not a \w character. Here is what Perl 5 says about \b; you may want to check your own docs to see if they agree:

A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W". (Within character classes "\b" represents backspace rather than a word boundary, just as it normally does in any double-quoted string.)
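
The definition quoted above can be checked with a few one-liners. This is a small sketch in TypeScript, not AS3, on the assumption that both engines define \b in the same way; the test strings and variable names are invented for the example.

```typescript
// Each check puts \b directly after the term, as the question's pattern does.
const afterBangThenSpace: boolean = /reverse!\b/.test("reverse! push");
const afterBangThenWord: boolean = /reverse!\b/.test("reverse!push");
const afterWordThenSpace: boolean = /push\b/.test("push reverse!");

console.log(afterBangThenSpace); // false: "!" and " " are both \W, so there is no boundary
console.log(afterBangThenWord);  // true: "!" is \W and "p" is \w
console.log(afterWordThenSpace); // true: "h" is \w and " " is \W
```

So for a term ending in "!", a trailing \b accepts exactly the case the question wants to reject and rejects the case it wants to accept.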

So the regex you need is something like

 / \b ( reverse!push | reverse! | push ) (?: \s | \b | $ )+ /gx; 

I left off the /s because there are no periods in this regex, so "treat as a single line" makes no sense here. If /s doesn't mean "treat as a single line" in your engine, you should probably add it back. Also, you should read up on how your engine handles alternation. I know that in Perl 5 you must arrange the items in this order to get the right behavior (otherwise reverse! would always win over reverse!push).
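
The ordering point can be seen in isolation with two stripped-down alternations. This is a sketch in TypeScript, whose engine, like Perl 5, tries alternatives from left to right; whether AS3 behaves the same way is an assumption worth checking against its docs. The variable names are illustrative only.

```typescript
const combined: string = "reverse!push";

// Longer alternative listed first: the whole token is consumed.
const longerFirst: RegExpExecArray | null = /reverse!push|reverse!/.exec(combined);

// Shorter alternative listed first: it succeeds and the longer one is never tried.
const shorterFirst: RegExpExecArray | null = /reverse!|reverse!push/.exec(combined);

console.log(longerFirst ? longerFirst[0] : null);   // "reverse!push"
console.log(shorterFirst ? shorterFirst[0] : null); // "reverse!"
```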

0

You can replace \b with something equivalent but less strict:

 /(?<=\s|^)(reverse!|push)(?=\s|$)/g 

This way the limiting factor of \b (that it can only match before or after an actual \w word character) is gone.

Now whitespace or the start/end of the string function as valid delimiters, and the inner expression can easily be built at run time, for example from a list of search terms.
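
Building the pattern at run time might look roughly like the sketch below. It is written in TypeScript rather than AS3 and needs an engine with lookbehind support (ES2018 or later); the function names buildTermRegex and escapeTerm are hypothetical helpers, not part of any library, and the escaping step is an addition for terms that contain regex metacharacters.

```typescript
// Escape regex metacharacters so each term is matched literally.
function escapeTerm(term: string): string {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap the escaped terms in the whitespace/start/end delimiters from the answer.
function buildTermRegex(terms: string[]): RegExp {
  const inner: string = terms.map(escapeTerm).join("|");
  return new RegExp("(?<=\\s|^)(" + inner + ")(?=\\s|$)", "g");
}

const termRegex: RegExp = buildTermRegex(["reverse!", "push"]);
const found: string[] = "push reverse! reverse!push".match(termRegex) || [];

console.log(found); // ["push", "reverse!"]; nothing inside reverse!push matches
```

Because both sides now require whitespace or a string edge, neither half of reverse!push is picked up, and the order of the terms in the list does not affect the result.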

0

Source: https://habr.com/ru/post/888622/

