Why does "abcdef" not match (?=abc)def, but match abc(?=def)?

In JavaScript, I have the string abcdef and cannot understand this strange behavior:

  • (?=abc)def does not match the string
  • abc(?=def) matches the string

Why?

+6
4 answers

In (?=abc)def, the lookahead (?=abc) is zero-width and does not move the cursor forward in the input string after a successful match. The construct says: look ahead at the next three characters to see whether they are abc; if they are, then check whether those same characters are def. At this point, the match fails.
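A minimal sketch of this behavior, which can be run in Node or a browser console. The variable names are only illustrative and do not come from the question.

```javascript
const input = "abcdef";

// (?=abc) asserts that "abc" starts at the current position but consumes nothing,
// so "def" is then tested at that same position, where the text is "abc".
const lookaheadFirst = /(?=abc)def/.exec(input);
console.log(lookaheadFirst); // null

// The lookahead on its own succeeds with an empty (zero-width) match at index 0.
const lookaheadOnly = /(?=abc)/.exec(input);
console.log(lookaheadOnly[0].length, lookaheadOnly.index); // 0 0

// Only consuming "abc" after the assertion moves the position forward to "def".
const consumeThenDef = /(?=abc)abcdef/.exec(input);
console.log(consumeThenDef[0]); // "abcdef"
```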

To follow the match, you need to understand how the regex engine works. Consider your input string abcdef and the regular expression abc(?=def). The engine starts by matching a, then moves the cursor in the input string to the next character and tries to match b; because the cursor is at b, the match succeeds. The engine then advances the cursor and tries to match c; since the cursor is at c, the match succeeds and the cursor advances again to the next character. Now the engine encounters (?=def). At this point it simply looks ahead, without moving the cursor, to see whether the next three characters from the cursor position are def. They are, so the match succeeds.
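The walkthrough above can be observed from the outside. This is a sketch rather than a view into the engine itself: it uses the lastIndex property of a global regex as a stand-in for where the cursor ends up after the match.

```javascript
// The match consists only of the consumed characters "abc".
const abcMatch = /abc(?=def)/.exec("abcdef");
console.log(abcMatch[0], abcMatch.index); // "abc" 0

// With the g flag, lastIndex records where the match ended.
// It is 3, not 6: the lookahead inspected "def" without consuming it.
const globalRe = /abc(?=def)/g;
globalRe.test("abcdef");
console.log(globalRe.lastIndex); // 3

// If "def" does not follow, the lookahead fails and there is no match.
const noFollow = /abc(?=def)/.exec("abcxyz");
console.log(noFollow); // null
```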

Now consider the input string xyz and the regular expression x(?=y)z. The regex engine places the cursor at the first character of the input string, checks whether it is x, finds that it is, and moves the cursor to the next character. It then looks ahead to see whether the next character is y, which it is, but a lookahead does not advance the cursor, so the cursor remains at y. The engine then checks whether the character at the cursor is z, but since the cursor is still at y, the match fails.
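The xyz case can be checked the same way. The second pattern below, x(?=y)yz, is not from the answer; it is added here only to show that the match succeeds once y is actually consumed after the lookahead.

```javascript
// After matching "x" the cursor sits on "y"; the lookahead confirms "y"
// but does not advance, so "z" is compared against "y" and fails.
const xThenZ = /x(?=y)z/.test("xyz");
console.log(xThenZ); // false

// Consuming "y" explicitly after the lookahead lets "z" match.
const xThenYZ = /x(?=y)yz/.test("xyz");
console.log(xThenYZ); // true

// On its own, x(?=y) matches just the "x".
const xOnly = /x(?=y)/.exec("xyz");
console.log(xOnly[0]); // "x"
```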

You can read a lot more about positive and negative lookaround at http://www.regular-expressions.info/lookaround.html

+18

(?=...) is a lookahead; in other words, it checks the string to the right of the current position. Also note that a lookahead is a zero-width assertion that does not consume any characters. In your first example, (?=abc)def requires the text at the current position to be abc and, because nothing has been consumed, requires that same text to be def. This is why the pattern does not match.

In your second example, it finds def after abc, so the string matches.

+4

MDN's definition for JavaScript:

x(?=y)
Matches "x" only if "x" is followed by "y". This is called lookahead.

For example, /Jack(?=Sprat)/ matches "Jack" only if it is followed by "Sprat". /Jack(?=Sprat|Frost)/ matches "Jack" only if followed by "Sprat" or "Frost". However, neither Sprat nor Frost are part of the match.

So when (?=y) is preceded by another expression x, x matches only if it is followed by y. In your case the preceding expression is an empty string. Without a leading expression, (?=abc) checks that the next three characters are abc without consuming them, and then def is tested against those same characters, which fails.
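The MDN examples quoted above can be tried directly. The input strings below ("JackSprat", "JackFrost", "JackSmith") are made up for illustration.

```javascript
// "Jack" is matched only when "Sprat" follows; "Sprat" is not part of the match.
const sprat = /Jack(?=Sprat)/.exec("JackSprat");
console.log(sprat[0]); // "Jack"

// With an alternation inside the lookahead, either continuation is accepted.
const frost = /Jack(?=Sprat|Frost)/.exec("JackFrost");
console.log(frost[0]); // "Jack"

// Any other continuation makes the lookahead fail.
const smith = /Jack(?=Sprat|Frost)/.exec("JackSmith");
console.log(smith); // null
```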

+2

Based on your answer to my comment, I think you want a positive lookbehind:

 (?<=abc)def 

Edit:

Since you are using JavaScript (sorry, I just read your question - I didn't look at the tags), why not just use a regular capture group and include the captured text in the replacement string?

 "abcdef".replace(/(abc)def/, "$1") 
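For comparison, here is a sketch of both approaches side by side. It assumes an engine that supports lookbehind assertions (added to JavaScript in ES2018); on older engines the lookbehind pattern is a syntax error and only the capture-group version works.

```javascript
// Capture group: match "abcdef" and put back only the captured "abc".
const viaGroup = "abcdef".replace(/(abc)def/, "$1");
console.log(viaGroup); // "abc"

// Lookbehind: match only "def", provided "abc" comes immediately before it.
const lookbehindMatch = /(?<=abc)def/.exec("abcdef");
console.log(lookbehindMatch[0], lookbehindMatch.index); // "def" 3

// Replacing that match with an empty string gives the same result as above.
const viaLookbehind = "abcdef".replace(/(?<=abc)def/, "");
console.log(viaLookbehind); // "abc"
```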
+2

Source: https://habr.com/ru/post/946912/

