RegEx to return the 'href' attribute of only 'link' tags?

I am trying to write a regex that returns the href attributes of <link> tags only.

Why does this regex return every href, including the ones in <a> tags?

    (?<=<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+

    <link rel="stylesheet" rev="stylesheet"
    href="idlecore-tidied.css?T_2_5_0_228" media="screen">
    <a href="anotherurl">Slash Boxes</a>

Thank you

+3
5 answers

Either

/(?<=<link\b[^<>]*?)\bhref\s*=\s*(?:"[^"]*"|'[^']*'|\S+)/

or

/<link\b[^<>]*?\b(href\s*=\s*(?:"[^"]*"|'[^']*'|\S+))/

The main difference is the use of [^<>]*? instead of .*?. This is because you do not want the match to continue into other tags.
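To see the difference concretely, here is a minimal sketch in Python. Python is my assumption (the question does not name an engine), and because Python's `re` module does not accept variable-length lookbehind, the sketch uses the capture-group form. The sample HTML and the variable names are made up for illustration.

```python
import re

# A <link> tag with no href at all, followed by an <a> tag that has one.
html = '<link rel="stylesheet" media="screen"><a href="anotherurl">Slash Boxes</a>'

# .*? is free to run past the end of the <link> tag and into the next tag.
loose = re.compile(r'<link\s+.*?(href\s*=\s*(["\'])[^"\']+\2)')

# [^<>]*? cannot cross a tag boundary, so the search stays inside <link ...>.
strict = re.compile(r'<link\b[^<>]*?\b(href\s*=\s*(["\'])[^"\']+\2)')

loose_match = loose.search(html)
strict_match = strict.search(html)

# The loose pattern wrongly picks up the href that belongs to the <a> tag.
print(loose_match.group(1) if loose_match else None)
# The strict pattern finds nothing, because this <link> tag has no href.
print(strict_match.group(1) if strict_match else None)
```

The same reasoning applies to the lookbehind form: with .*? the lookbehind is satisfied by any <link that appears anywhere earlier in the text.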

+3

<link\s+[^>]*(href\s*=\s*(['"]).*?\2)

This works in Regex Coach with the s and g options.

+1

/(?<=<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+[^>]*>/

or, without the lookbehind:

/(<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+[^>]*>/

The second form does not rely on lookbehind, so it should also work in Javascript.

0

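Which regex engine are you using? Perl, for example, does not support variable-length lookbehind. Where it is supported, you can use (building on MizardX's answer):

(?<=<link\b[^<>]*?)href\s*=\s*(['"])(?:(?!\1).)+\1

This also makes sure that the closing quote (' or ") is the same as the opening one. The same thing without (variable-length) lookbehind:

(?:<link\b[^<>]*?)(href\s*=\s*(['"])(?:(?!\2).)+\2)

The match is then in group 1 (\1).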
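As a rough illustration of the lookbehind-free pattern, here is a short Python sketch (Python is an assumption, the question does not say which engine is in use). The second <link> tag, with a double quote inside a single-quoted value, is a contrived example added only to show the quote back-reference at work.

```python
import re

# Lookbehind-free pattern from above: group 1 is the whole href attribute,
# group 2 is the opening quote, which must also close the value.
pattern = re.compile(r'''(?:<link\b[^<>]*?)(href\s*=\s*(['"])(?:(?!\2).)+\2)''')

html = '''<link rel="stylesheet" rev="stylesheet"
href="idlecore-tidied.css?T_2_5_0_228" media="screen">
<link rel='alternate' href='feed"v2".xml'>
<a href="anotherurl">Slash Boxes</a>'''

# Collect group 1 from every match; the <a> tag is never matched.
hrefs = [m.group(1) for m in pattern.finditer(html)]
for href in hrefs:
    print(href)
```

Because (?!\2). only refuses the quote character that opened the value, the other quote character is allowed inside it.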

0

(?<=<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+

works in Expresso (I think Expresso runs on the .NET regex engine). You can refine it a little further so that the closing ' or " has to match the opening one:

(?<=<link\s+.*?)href\s*=\s*([\'\"])[^\'\"]+(\1)

Your regex engine may not support lookbehind assertions of this kind. A workaround would be

(?:<link\s+.*?)(href\s*=\s*([\'\"])[^\'\"]+(\2))

Your match will be in captured group 1.
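For example, Python's `re` module is one engine that rejects this lookbehind because it is not fixed-width. The sketch below is only an illustration under that assumption: it checks whether the lookbehind pattern compiles and then falls back to the capture-group workaround. The `re.DOTALL` flag is my addition, needed here because the href in the sample sits on a different line from `<link` and `.` does not match a newline by default.

```python
import re

html = '''<link rel="stylesheet" rev="stylesheet"
href="idlecore-tidied.css?T_2_5_0_228" media="screen">
<a href="anotherurl">Slash Boxes</a>'''

lookbehind = r'(?<=<link\s+.*?)href\s*=\s*(["\'])[^"\']+(\1)'
try:
    re.compile(lookbehind)
    lookbehind_supported = True
except re.error:
    # Python's re only accepts fixed-width lookbehind patterns.
    lookbehind_supported = False

# Workaround: consume the prefix normally and capture the attribute in group 1.
workaround = re.compile(
    r'(?:<link\s+.*?)(href\s*=\s*(["\'])[^"\']+(\2))',
    re.DOTALL,  # assumption: let .*? span the line break inside the tag
)
hrefs = [m.group(1) for m in workaround.finditer(html)]

print(lookbehind_supported)
print(hrefs)
```

Only the href of the <link> tag is returned for this sample, because each match has to start at a `<link`.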

0

Source: https://habr.com/ru/post/1699203/

