
Why is my PHP regular expression that parses Markdown links broken?

 $pattern = "/\[(.*?)\]\((.*?)\)/i";
 $replace = "<a href=\"$2\" rel=\"nofollow\">$1</a>";
 $text = "blah blah [LINK1](http://example.com) blah [LINK2](http://sub.example.com/) blah blah ?";
 echo preg_replace($pattern, $replace, $text);

The above works, but if a space is accidentally inserted between the [] and the (), everything breaks and the two links are merged into one:

 $text = "blah blah [LINK1] (http://example.com) blah [LINK2](http://sub.example.com/) blah blah ?"; 

I have a feeling that the star is what breaks it, but I don't know how else to match repeated links.

+6
2 answers

If I understand correctly, all you need to do is also match any number of spaces between them, for example:

 /\[([^]]*)\] *\(([^)]*)\)/i 

Explanation:

 \[        # Matches the opening square bracket (escaped)
 ([^]]*)   # Captures any number of characters that aren't close square brackets
 \]        # Match close square bracket (escaped)
  *        # Match any number of spaces
 \(        # Match the opening bracket (escaped)
 ([^)]*)   # Captures any number of characters that aren't close brackets
 \)        # Match the close bracket (escaped)
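As a quick check, here is a minimal sketch that runs this pattern against the text from the question, including the accidental space after [LINK1]. Single-quoted strings are used only to avoid escaping the double quotes in the replacement; the variable names follow the question.

```php
<?php
// Pattern from this answer: link text without "]", optional spaces,
// then a URL without ")".
$pattern = '/\[([^]]*)\] *\(([^)]*)\)/';
$replace = '<a href="$2" rel="nofollow">$1</a>';

// Input from the question, with a space between [LINK1] and its URL.
$text = 'blah blah [LINK1] (http://example.com) blah [LINK2](http://sub.example.com/) blah blah ?';

$result = preg_replace($pattern, $replace, $text);
echo $result, PHP_EOL;
```

Both links should come out as separate anchors, with LINK1 pointing at http://example.com and LINK2 at http://sub.example.com/.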

Justification:

I should probably justify why I changed your .*? to [^]]*.

The second version is more efficient because it avoids much of the backtracking that .*? does. In addition, after finding a [ , the .*? version will keep searching until it finds a match, instead of failing when this is not the tag we want. For example, if we match the .*? expression against:

 Sad face :[ blah [LINK1](http://sub.example.com/) blah 

it will match

 [ blah [LINK1] 

and

 http://sub.example.com/ 

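Using [^]]* means the link text can never run past a closing square bracket, which is what merged the two links in the question. On its own it still accepts the stray [ in this example, because only ] is excluded; excluding both brackets with [^][]* rejects it.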
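To see what each variant captures on the "Sad face" input, the sketch below compares the link text (group 1) for three patterns. The third pattern, which excludes both [ and ] from the link text, is an addition for illustration and not part of the original answer; the array keys are just labels.

```php
<?php
$text = 'Sad face :[ blah [LINK1](http://sub.example.com/) blah';

// Labels are arbitrary; only the patterns matter.
$patterns = [
    'lazy dot'   => '/\[(.*?)\]\((.*?)\)/',
    'not ]'      => '/\[([^]]*)\] *\(([^)]*)\)/',
    'not [ or ]' => '/\[([^][]*)\] *\(([^)]*)\)/', // assumption: stricter variant
];

$linkText = [];
foreach ($patterns as $name => $pattern) {
    preg_match($pattern, $text, $m);
    $linkText[$name] = $m[1]; // captured link text
    printf("%-12s => '%s'\n", $name, $m[1]);
}
```

The first two patterns both start at the stray [ and capture " blah [LINK1", since [^]]* only forbids ]. The third cannot cross the second [, so it fails at the stray bracket and captures "LINK1".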

+7

Try the following:

 $pattern = "/\[(.*?)\]\s?\((.*?)\)/i"; 

\s? is added between \[(.*?)\] and \((.*?)\) so that one optional whitespace character is allowed there.
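One thing worth checking is that \s? allows at most one whitespace character. The sketch below, using a made-up input with two spaces, compares it with \s*, which allows any number; the variable names are only for illustration.

```php
<?php
// Hypothetical input: two spaces between "]" and "(".
$text = '[LINK1]  (http://example.com)';

// \s? tolerates zero or one whitespace character.
$single = preg_match('/\[(.*?)\]\s?\((.*?)\)/', $text);

// \s* tolerates any amount of whitespace.
$any = preg_match('/\[(.*?)\]\s*\((.*?)\)/', $text);

var_dump($single, $any);
```

Here the \s? pattern finds no match while the \s* pattern does, so the quantifier to pick depends on how forgiving the parser should be.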

0

Source: https://habr.com/ru/post/915603/

