Regexp with '&' char using preg_replace

I am trying to parse URLs containing & using preg_replace.

$content = preg_replace('#https?://[a-z0-9._/\?=&-]+#i', '<a href="$0" target="_blank">$0</a>', $content);

But I use it for user comments, so I also use the htmlspecialchars() function to prevent XSS.

function formatContributionContent($content)
{
    $content = nl2br(htmlspecialchars($content));

    // Regexp for mails
    $content = preg_replace('#[a-z0-9._-]+@[a-z0-9._&-]{2,}\.[a-z]{2,4}#', '<a href="mailto:$0">$0</a>', $content);

    // Regexp for urls
    $content = preg_replace('#https?://[a-z0-9._/\?=&-]+#i', '<a href="$0" target="_blank">$0</a>', $content);

    var_dump($content);
}

formatContributionContent('https://openclassrooms.com/index.php?page=3&skin=blue');

But htmlspecialchars converts & to "&amp;", so my regular expression produces the wrong result. For example, with the following URL:

https://openclassrooms.com/index.php?page=3&skin=blue

I get:

<a href="https://openclassrooms.com/index.php?page=3&amp" target="_blank">https://openclassrooms.com/index.php?page=3&amp</a>;skin=blue
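
To see where the match stops, here is a minimal sketch that applies only the URL pattern to the escaped string. It assumes the same pattern and URL as above; `$escaped` and `$m` are just illustrative variable names.

```php
<?php
// Escape first, as the function above does: "&" becomes "&amp;".
$escaped = htmlspecialchars('https://openclassrooms.com/index.php?page=3&skin=blue');

// Same URL pattern as above: ";" is not in the character class,
// so the match ends right before the ";" of "&amp;".
preg_match('#https?://[a-z0-9._/\?=&-]+#i', $escaped, $m);

echo $m[0], "\n"; // https://openclassrooms.com/index.php?page=3&amp
```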
1 answer

You can add ";" to the character class in your regular expression, for example:

$content = preg_replace('#https?://[a-z0-9._/\?=&;-]+#i', '<a href="$0" target="_blank">$0</a>', $content);

This way, the "&" that htmlspecialchars converts to "&amp;" is still matched as part of the URL.
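
As a rough check, the sketch below runs only the URL step with ";" added to the character class. The function name `linkifyUrls` is hypothetical and not from the original code; the pattern and replacement string are the ones shown above.

```php
<?php
// Hypothetical helper (name is an assumption): escape the text,
// then wrap URLs in links using the pattern that also accepts ";".
function linkifyUrls(string $content): string
{
    $escaped = htmlspecialchars($content);

    return preg_replace(
        '#https?://[a-z0-9._/\?=&;-]+#i',
        '<a href="$0" target="_blank">$0</a>',
        $escaped
    );
}

echo linkifyUrls('https://openclassrooms.com/index.php?page=3&skin=blue'), "\n";
// <a href="https://openclassrooms.com/index.php?page=3&amp;skin=blue" target="_blank">https://openclassrooms.com/index.php?page=3&amp;skin=blue</a>
```

One trade-off to keep in mind: since "&", ";" and letters are all in the character class, an escaped entity that directly follows a URL (for example "&quot;" from a closing quote) would also be pulled into the link.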


Source: https://habr.com/ru/post/1606124/

