Regex ignore URLs already in HTML tags

I have a small problem with my regex.

I made my own BBcode for my site, but I also want the URLs to be processed as well.

I use preg_replace, and this is the pattern I use to identify URLs:

/([\w]+:\/\/[\w-?&;#~=\.\/\@]+[\w\/])/is 

This works fine. However, if the URL is inside an [img]...[/img] block, the pattern above also picks it up and produces this result:

 // [img]http://url.com/toimg.jeg[/img] produces this result:
 <img src="<a href="http://url.com/toimg.jeg" target="_blank">/>
 // when it should produce:
 <img src="http://url.com/toimg.jeg"/>

I tried using this:

 /([^"][\w]+:\/\/[\w-?&;#~=\.\/\@]+[\w\/][^"])/is 

No luck.

Any help would be appreciated.

Edit: For a solution, see the second comment on stema's answer.

1 answer

Try this:

 (?<!href=")(\b[\w]+:\/\/[\w-?&;#~=\.\/\@]+[\w\/]) 

See it on Regexr

To make it more general, you can simplify the lookbehind so that it only checks for =" before the URL:

 (?<!=")(\b[\w]+:\/\/[\w-?&;#~=\.\/\@]+[\w\/]) 

See it on Regexr
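A minimal sketch of how the more general pattern could be used with preg_replace. The helper name linkify and the sample input are assumptions made for illustration, not part of the original question. The hyphen in the character class is escaped here (\w\-?) because newer PHP versions built on PCRE2 may reject \w-? as an invalid range; otherwise the pattern is the one shown above.

```php
<?php
// Hypothetical helper: wraps bare URLs in <a> tags, but leaves alone any URL
// that directly follows =" (for example inside src="..." or href="...").
function linkify(string $html): string
{
    // Same pattern as above, with the hyphen escaped for PCRE2 compatibility.
    $pattern = '/(?<!=")(\b[\w]+:\/\/[\w\-?&;#~=.\/@]+[\w\/])/is';

    return preg_replace($pattern, '<a href="$1" target="_blank">$1</a>', $html);
}

// Sample input: one URL already inside an attribute, one bare URL.
$input = '<img src="http://url.com/toimg.jeg"/> see http://example.com/page';

echo linkify($input), "\n";
```

With this input, only the bare URL after "see" should be wrapped in a link; the src attribute is left untouched. This assumes the [img] tags have already been converted to <img> tags before the URL pass runs.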

(?<!href=") is a negative lookbehind assertion. It ensures that there is no href=" immediately before the text your pattern matches.

\b is a word boundary. It anchors the start of the link at a transition from a non-word character to a word character. Without it, the lookbehind would be useless: the regex would skip the first character and match from "ttp://..." onward.
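The effect of \b can be checked with a small sketch that runs the pattern with and without the word boundary. The helper name firstUrlMatch and the variable names are assumptions for illustration; the hyphen in the character class is escaped because newer PCRE2-based PHP versions may reject \w-? as an invalid range.

```php
<?php
// Hypothetical helper: returns the first match of $pattern in $text, or null.
function firstUrlMatch(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

$html = '<img src="http://url.com/toimg.jeg"/>';

// Lookbehind only: the engine moves one character forward, where the
// lookbehind no longer sees =" and the match starts at "ttp".
$withoutBoundary = '/(?<!=")([\w]+:\/\/[\w\-?&;#~=.\/@]+[\w\/])/i';

// Lookbehind plus \b: a match may only start at the beginning of a word,
// so starting inside "http" is not allowed.
$withBoundary = '/(?<!=")(\b[\w]+:\/\/[\w\-?&;#~=.\/@]+[\w\/])/i';

var_dump(firstUrlMatch($withoutBoundary, $html)); // string "ttp://url.com/toimg.jeg"
var_dump(firstUrlMatch($withBoundary, $html));    // NULL
```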


Source: https://habr.com/ru/post/1399789/

