Conditional regex match for escaped apostrophe

$str = "'ei-1395529080',0,0,1,1,'Name','email@domain.com','Sentence with \'escaped apostrophes\', which \'should\' be on one line!','no','','','yes','6.50',NULL";

preg_match_all("/(')?(.*?)(?(1)(?!\\\\)'),/s", $str.',', $values);
print_r($values);

I am trying to write a regex with these goals:

  • Return an array of the comma-separated values (note that I append a , to $str in the preg_match_all call)
  • If an array element starts with ', match the closing '
  • But if a ' is escaped as \', keep capturing the value until a ' without a preceding \ is found

If you try these lines, the match goes wrong when it encounters \',

Can someone explain what is happening and how to fix it? Thanks.

+4
2 answers

Here's how I decided to solve it:

('(?>\\.|.)*?'|[^\,]+)

Regex101

Explanation:

(              Start capture group
    '          Match an apostrophe
    (?>        Atomically match the following
        \\.    Match \ literally and then any single character
        |.     Or match just any single character
    )          Close atomic group
    *?'        Match previous group 0 or more times until the first '
    |[^\,]     OR match any character that is not a comma (,)
    +          Match the previous regex [^\,] one or more times
)              Close capture group

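For example, take the string:

'a \' b'

The pattern consumes it in these steps (the spaces, each matched by a plain ., are omitted):

  • '
  • a
  • \'
  • b
  • '

Because \\. is tried before ., a backslash is always consumed together with the character that follows it. An escaped \' is therefore swallowed as one unit and cannot be mistaken for the closing '.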


In PHP, where the pattern sits inside a string literal, each backslash has to be doubled: ('(?>\\\\.|.)*?'|[^\\,]+)
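Here is a minimal sketch of how this pattern could be run from PHP against the string in the question. The ~ delimiters, the double-quoted pattern string, and the names $pattern, $matches and $values are assumptions made for illustration, not part of the answer itself. Note that an empty unquoted field (as in a,,b) would simply be skipped by this pattern.

```php
<?php
// Input string from the question. In a double-quoted PHP string, \' is not an
// escape sequence, so both the backslash and the apostrophe are kept.
$str = "'ei-1395529080',0,0,1,1,'Name','email@domain.com','Sentence with \'escaped apostrophes\', which \'should\' be on one line!','no','','','yes','6.50',NULL";

// Doubled backslashes: after string parsing the regex engine receives
// ('(?>\\.|.)*?'|[^\,]+)
$pattern = "~('(?>\\\\.|.)*?'|[^\\,]+)~";

preg_match_all($pattern, $str, $matches);
$values = $matches[1];   // one entry per field; quoted fields keep their apostrophes

print_r($values);
echo count($values), "\n";   // 14 fields
```

The quoted sentence comes back as a single element, with its \' sequences still in place.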


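The pattern also needs delimiters in PHP, for example / or ~.

Why the doubled backslashes? Because the pattern passes through two parsers: first the string parser, then the regex engine.

Take a double-quoted string: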

"This is on one line.\nThis is on another line."

The string parser turns the \n into a line break, so the actual value is:

"This is on one line.
 This is on another line."

The same thing happens to a regex that is written as a string. Suppose you write:

"[^\n]*"

The string parser replaces the \n first, so the regex engine receives:

"[^
 ]*"

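This happens to work, because the character class now contains a literal line break. But it was the string parser, not the regex engine, that interpreted the \n. The regex engine understands \n as an escape sequence too (along with others: \r, \t, \\ and so on). To hand the escape sequence \n to the regex engine, escape the backslash in the string, \\, and follow it with n: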

"[^\\n]*"

After string parsing, the regex engine receives:

"[^\n]*"

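Here \\ is an escape sequence of the string parser that means "a literal \". The n that follows is left alone, so the regex engine gets \n intact and interprets it as an escape sequence itself.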

So why 4 backslashes? Look at this part of the pattern:

(?>\\.|.)

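This is what the regex engine has to receive, not what goes into the string. To the regex engine, \\. means "a literal backslash, followed by any single character". Typed into a string as is, the \\ would collapse to a single \ and the engine would see \. instead, which matches a literal dot.

So each \ of the \\ must itself be written as \\ in the string.

That gives 4 backslashes: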

(?>\\\\.|.)
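
The difference between two and four backslashes can be checked with a small sketch. The subject string, the ^ anchor, the ~ delimiters and the variable names are assumptions chosen for the demonstration.

```php
<?php
// Subject: a backslash, an apostrophe and the letter b (3 characters).
$subject = "\\'b";

// Four backslashes in the source: the regex engine receives ^(?>\\.|.)
preg_match("~^(?>\\\\.|.)~", $subject, $right);

// Two backslashes in the source: the regex engine receives ^(?>\.|.)
preg_match("~^(?>\\.|.)~", $subject, $wrong);

echo strlen($right[0]), "\n";   // 2: the backslash and the apostrophe together
echo strlen($wrong[0]), "\n";   // 1: only the backslash, matched by the plain .
```

With only two backslashes the escaped apostrophe is no longer consumed as a unit, so it could end the quoted value early.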
+3

Another option: (?:'([^'\\]*(?:\\.[^'\\]*)*)'|([^,]+))


# (?:'([^'\\]*(?:\\.[^'\\]*)*)'|([^,]+))
# 
# Options: Case sensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Greedy quantifiers
# 
# Match the regular expression below «(?:'([^'\\]*(?:\\.[^'\\]*)*)'|([^,]+))»
#    Match this alternative (attempting the next alternative only if this one fails) «'([^'\\]*(?:\\.[^'\\]*)*)'»
#       Match the character "'" literally «'»
#       Match the regex below and capture its match into backreference number 1 «([^'\\]*(?:\\.[^'\\]*)*)»
#          Match any single character NOT present in the list below «[^'\\]*»
#             Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
#             The literal character "'" «'»
#             The backslash character «\\»
#          Match the regular expression below «(?:\\.[^'\\]*)*»
#             Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
#             Match the backslash character «\\»
#             Match any single character that is NOT a line break character (line feed) «.»
#             Match any single character NOT present in the list below «[^'\\]*»
#                Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
#                The literal character "'" «'»
#                The backslash character «\\»
#       Match the character "'" literally «'»
#    Or match this alternative (the entire group fails if this one fails to match) «([^,]+)»
#       Match the regex below and capture its match into backreference number 2 «([^,]+)»
#          Match any character that is NOT a "," «[^,]+»
#             Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

https://regex101.com/r/pO0cQ0/1

// $subject is the input string, e.g. $str from the question.
preg_match_all('/(?:\'([^\'\\\\]*(?:\\\\.[^\'\\\\]*)*)\'|([^,]+))/', $subject, $result, PREG_SET_ORDER);
foreach ($result as $match) {
    // Quoted value: $match[1] holds the text between the apostrophes; escaped quotes (\') still need processing.
    // Unquoted value: $match[2] holds it; with PREG_SET_ORDER this index is absent when the quoted alternative matched.
}
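
To show the loop above end to end, here is a sketch that collects the fields of the question's string into an array. The $values array and the use of stripslashes() to unescape quoted fields are assumptions; stripslashes() removes every backslash escape, which may or may not be what your data needs.

```php
<?php
// Input string from the question; \' stays as backslash + apostrophe in double quotes.
$subject = "'ei-1395529080',0,0,1,1,'Name','email@domain.com','Sentence with \'escaped apostrophes\', which \'should\' be on one line!','no','','','yes','6.50',NULL";

// Single-quoted pattern: \' becomes ' and \\\\ becomes \\ before the regex engine sees it.
$pattern = '/(?:\'([^\'\\\\]*(?:\\\\.[^\'\\\\]*)*)\'|([^,]+))/';

preg_match_all($pattern, $subject, $result, PREG_SET_ORDER);

$values = [];
foreach ($result as $match) {
    if (isset($match[2])) {
        // Unquoted field (0, 1, NULL, ...): group 2 is only present for this alternative.
        $values[] = $match[2];
    } else {
        // Quoted field: group 1 is the text between the apostrophes; drop the escapes.
        $values[] = stripslashes($match[1]);
    }
}

print_r($values);
```

Unquoted NULL is returned as the string "NULL" here; mapping it to a real null is left to the caller.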
+2

Source: https://habr.com/ru/post/1619433/

