Regular Expression Selection

I have a string like this:

<p class='link'>try</p>bla bla</p>

I want to get only <p class='link'>try</p>. I tried this: /<p class='link'>[^<\/p>]+<\/p>/

But that does not work.

How can I do this? Thanks.

+3
4 answers

If that is your string and you want the text between the p tags, then this should work:

/<p\sclass='link'>(.*?)<\/p>/

Yours doesn't work because you put the characters <\/p> inside a character class. A class does not match that sequence literally; it checks each character in it separately.
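To see the difference the lazy quantifier makes, here is a minimal sketch in Python (an assumption on my part, since the question does not say which language is used; the variable names are just for illustration). Python takes the pattern as a plain string, so the / delimiters and the \/ escape are not needed.

```python
import re

line = "<p class='link'>try</p>bla bla</p>"

# Lazy quantifier: (.*?) stops at the first </p> it can reach.
lazy = re.search(r"<p\sclass='link'>(.*?)</p>", line)
print(lazy.group(0))  # <p class='link'>try</p>
print(lazy.group(1))  # try

# Greedy version for comparison: (.*) runs on to the last </p> in the string.
greedy = re.search(r"<p\sclass='link'>(.*)</p>", line)
print(greedy.group(0))  # <p class='link'>try</p>bla bla</p>
```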

Of course, I should mention that there are better tools for parsing HTML fragments (such as an HTML parser).
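As a rough illustration of the parser route, here is a sketch using Python's standard html.parser module. The class name LinkParagraphs and its attributes are hypothetical, and the check only handles an element whose class attribute is exactly 'link'.

```python
from html.parser import HTMLParser


class LinkParagraphs(HTMLParser):
    """Collects the text of every <p class='link'> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are inside a matching <p>
        self.texts = []      # collected text, one entry per matching <p>

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes removed.
        if tag == "p" and ("class", "link") in attrs:
            self.inside = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.texts[-1] += data


parser = LinkParagraphs()
parser.feed("<p class='link'>try</p>bla bla</p>")
parser.close()
print(parser.texts)  # ['try']
```

The stray closing </p> at the end of the string is simply reported as another end tag, so the parser copes with it without any special handling.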

+4

Give this a try:

'/<p[^>]+>([^<]+)<\/p>/'

0

It looks like you used the block [^<\/p>]+ intending to match anything but </p>. Unfortunately, that is not what it does. A [] block matches a single character from the set inside, and with a leading ^ it matches any single character that is not in the set. So [^<\/p>]+ consumes characters until it reaches the first <, /, p or >. If what follows at that point is not exactly </p> (for example, because the text itself contains a p), there is no match.
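A small Python sketch of how that character class behaves (the 'help' input is a made-up example, not taken from the question, and Python is my choice of language here):

```python
import re

# [^</p>] is a set of four excluded characters: '<', '/', 'p' and '>'.
pattern = re.compile(r"<p class='link'>[^</p>]+</p>")

# No excluded character inside the tag, so the class reaches </p>.
print(bool(pattern.search("<p class='link'>try</p>")))   # True

# 'help' contains a 'p': the class stops after 'hel' and </p> does not follow.
print(bool(pattern.search("<p class='link'>help</p>")))  # False

# The class on its own consumes characters up to the first excluded one.
print(re.match(r"[^</p>]+", "help</p>").group(0))        # hel
```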

Alex's solution with a non-greedy quantifier is how I tend to approach this problem.

0

I tried to make it less specific to any particular tag.

(<[^/]+?\s+[^>]*>[^>]*>)

This returns:

<p class='link'>try</p>
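For anyone who wants to check this, a quick sketch in Python (the second and third inputs are made-up examples, not from the question):

```python
import re

generic = re.compile(r"(<[^/]+?\s+[^>]*>[^>]*>)")

line = "<p class='link'>try</p>bla bla</p>"
print(generic.search(line).group(1))  # <p class='link'>try</p>

# The same pattern on a different tag.
print(generic.search("<a href='#'>go</a> rest").group(1))  # <a href='#'>go</a>

# The pattern requires whitespace inside the opening tag,
# so a tag without attributes is not matched.
print(generic.search("<b>x</b>"))  # None
```

Note that the pattern never checks that the closing tag has the same name as the opening one; it simply stops at the second > it finds.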

0

Source: https://habr.com/ru/post/1788983/

