Regexp with '&' char using preg_replace

Question

I am trying to parse URLs containing & using preg_replace:

$content = preg_replace('#https?://[a-z0-9._/\?=&-]+#i', '<a href="$0" target="_blank">$0</a>', $content);

But I use it for user comments, so I also call htmlspecialchars() to prevent XSS:

function formatContributionContent($content)
{
    $content = nl2br(htmlspecialchars($content));

    // Regexp for mails
    $content = preg_replace('#[a-z0-9._-]+@[a-z0-9._&-]{2,}\.[a-z]{2,4}#', '<a href="mailto:$0">$0</a>', $content);

    // Regexp for urls
    $content = preg_replace('#https?://[a-z0-9._/\?=&-]+#i', '<a href="$0" target="_blank">$0</a>', $content);

    var_dump($content);
}

formatContributionContent('https://openclassrooms.com/index.php?page=3&skin=blue');

The problem is that htmlspecialchars converts & to &amp;, so my regular expression produces the wrong result: the match stops at the ";" because it is not in the character class. For example, with the following URL:

https://openclassrooms.com/index.php?page=3&skin=blue

I get:

<a href="https://openclassrooms.com/index.php?page=3&amp" target="_blank">https://openclassrooms.com/index.php?page=3&amp</a>;skin=blue

html php regex preg-replace

Maluna34 Sep 06 '15 at 7:46

1 answer

scandel · Answer 1 · 2015-09-06T08:24:58+0000

You can add ";" to the list of characters matched by your regular expression, for example:

$content = preg_replace('#https?://[a-z0-9._/\?=&;-]+#i', '<a href="$0" target="_blank">$0</a>', $content);

This way, the "&amp;" that htmlspecialchars produces from "&" is matched as part of the URL.
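As a minimal sketch, here is the question's function with ";" added to the URL character class. The mail pattern is left out for brevity, and returning the string instead of calling var_dump is an assumption made here so the result can be reused.

```php
<?php
// Sketch only: the question's function, reduced to the URL step,
// with ';' added to the character class.
function formatContributionContent(string $content): string
{
    // Escape first: '&' in the input becomes '&amp;'
    $content = nl2br(htmlspecialchars($content));

    // ';' is now allowed, so the whole '&amp;' entity stays inside the match
    return preg_replace(
        '#https?://[a-z0-9._/\?=&;-]+#i',
        '<a href="$0" target="_blank">$0</a>',
        $content
    );
}

echo formatContributionContent('https://openclassrooms.com/index.php?page=3&skin=blue');
// <a href="https://openclassrooms.com/index.php?page=3&amp;skin=blue" target="_blank">https://openclassrooms.com/index.php?page=3&amp;skin=blue</a>
```

One side effect to keep in mind: because ";" and "&" are both accepted, an escaped character that directly follows a URL (for example the &quot; produced from a closing double quote) will also be pulled into the link.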
