How does the following regular expression work?

Let's say I have a line in which I would like to parse from an opening double quote to a closing double quote:

asdf"pass\"word"asdf

I was fortunate enough to find that the following PCRE matches from the opening double quote to the closing double quote while ignoring the escaped double quote in the middle (so that the logical unit is parsed correctly):

".*?(?:(?!\\").)"

Match:

"pass\"word"

However, I have no idea why this PCRE correctly matches the opening and closing double quotes.

I know the following:

"= literal double quote

.*? = lazy match of zero or more of any character

(?: = open a non-capturing group

(?!\\") = negative lookahead: asserts that a literal \" does not come next

. = single character

) = close the non-capturing group

"= literal double quote

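The breakdown above can be checked mechanically. The following is a minimal sketch in Python, whose `re` module supports the same lazy quantifier and negative lookahead as PCRE; the names `QUOTED`, `line` and `match` are chosen here for illustration and are not part of the question.

```python
import re

# The pattern from the question, spelled out with re.VERBOSE so that each
# piece can carry a comment. It behaves like the one-line form.
QUOTED = re.compile(r'''
    "            # literal opening double quote
    .*?          # lazily match zero or more characters
    (?:          # open a non-capturing group
      (?!\\")    #   negative lookahead: a backslash plus quote must not come next
      .          #   then consume exactly one character
    )            # close the non-capturing group
    "            # literal closing double quote
''', re.VERBOSE)

line = r'asdf"pass\"word"asdf'
match = QUOTED.search(line)
print(match.group(0))  # "pass\"word"
```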
, . , PCRE " , " , ".

, , PCRE .

- ?


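Lazy matching does not mean stopping at the first opportunity: the rest of the pattern still has to match. After the opening " matches, .*? is extended one character at a time until it has consumed everything up to r; then the lookahead + . matches d, and the closing " matches.

Note:

Why does it get as far as r? Doesn't the lookahead rule out \"? It does not, because the lookahead only guards the single character consumed by the . inside the group. Nothing stops .*? itself from consuming \", so the only thing ruled out is (?:(?!\\").) matching a backslash that sits right before a quote.

.*? is lazy, but the regex engine keeps extending it until the whole pattern matches.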
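To see which part of the pattern consumes which text, one option is to add capture groups around `.*?` and around the guarded character. This is a sketch in Python's `re` module; the two capture groups and the variable names are additions for illustration, not part of the original pattern.

```python
import re

# Same pattern as in the question, with two capture groups added to
# expose what each part matched.
pattern = re.compile(r'"(.*?)((?!\\").)"')

m = pattern.search(r'asdf"pass\"word"asdf')
lazy_part, guarded_char = m.groups()
print(lazy_part)     # pass\"wor  (consumed by .*?, including the escaped quote)
print(guarded_char)  # d          (the single character after the lookahead)

# Without a real closing quote there is no match: the only candidate
# closing quote follows a backslash, which the lookahead forbids.
print(pattern.search(r'asdf"pass\"word'))  # None
```

The escaped quote ends up inside the lazy part; the lookahead only prevents the backslash from being the last character before the closing quote.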

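Note 2:

The pattern can be simplified to ".*?[^\\]", which gives the same result here: the character just before the closing quote must not be a backslash.

A negative lookbehind is an alternative: ".*?(?<!\\)", which also matches "" (an empty string), but lookbehinds aren't supported in every engine or language (PCRE supports them, so from bash you could use, for example, grep -P '[pattern]' .., or perl).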
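The difference between the two alternatives shows up on an empty quoted string. A small comparison, sketched in Python (its `re` module supports fixed-width lookbehind); the sample `x""y` and the names used are assumptions made for this example.

```python
import re

samples = [r'asdf"pass\"word"asdf', 'x""y']

# Last character inside the quotes must not be a backslash.
char_class = re.compile(r'".*?[^\\]"')
# Closing quote must not be directly preceded by a backslash.
lookbehind = re.compile(r'".*?(?<!\\)"')

results = {}
for text in samples:
    for name, rx in (("char class", char_class), ("lookbehind", lookbehind)):
        found = rx.search(text)
        results[(name, text)] = found.group(0) if found else None
        print(name, repr(text), "->", results[(name, text)])
```

Both variants return `"pass\"word"` for the first sample. For `x""y` only the lookbehind variant matches (it returns `""`), because the character class needs at least one character between the quotes.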


Crayon Violent explain, , ( , ).

By the way, "PCRE" (Perl Compatible Regular Expression) is the name of a regex engine; the expression you write is called a "pattern".

Bash:

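A='asdf"pass\"word"asdf'
# Either a character that is neither a quote nor a backslash, or a backslash plus any character
pattern='"(([^"\\]|\\.)*)"'

# BASH_REMATCH[1] holds the text between the quotes
[[ $A =~ $pattern ]] && echo "${BASH_REMATCH[1]}"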

Or, consuming runs of ordinary characters in one step: pattern='"(([^"\\]+|\\.)*)"'

With PCRE, you can use possessive quantifiers:

"([^"\\]*+(?:\\.[^"\\]*)*+)"

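These patterns also handle escaped backslashes correctly. Examples: "abc\\\"def" (an escaped backslash followed by an escaped quote), "abcdef\\\\" (two escaped backslashes, so the final quote really closes the string).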
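The two examples can be run against both approaches. This sketch uses Python's `re` module and the plain alternation form of the pattern rather than the possessive one, since possessive quantifiers are only available in `re` from Python 3.11; the variable names are illustrative.

```python
import re

# Python spelling of the Bash pattern: either a character that is neither
# a quote nor a backslash, or a backslash followed by any character.
pairs = re.compile(r'"((?:[^"\\]|\\.)*)"')
# The lookahead pattern from the question, for comparison.
original = re.compile(r'".*?(?:(?!\\").)"')

results = []
for text in (r'"abc\\\"def"', r'"abcdef\\\\"'):
    by_pairs = pairs.search(text)
    by_original = original.search(text)
    results.append((
        by_pairs.group(1) if by_pairs else None,
        by_original.group(0) if by_original else None,
    ))
    print(text, "->", results[-1])
```

On the first string both patterns match the whole text. On the second, the pair-based pattern matches `abcdef\\\\` between the quotes, while the lookahead pattern finds no match at all, because it rejects any closing quote preceded by a backslash, even when that backslash is itself escaped.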


Source: https://habr.com/ru/post/1607123/

