RegEx: matching a pattern within a pattern - do I need positive lookbehinds?

I am trying to use RegEx to find a pattern within a pattern. Specifically, I want to grab the URL from a link and then capture everything that comes after the last = sign in that URL.

So this line

<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>

I would first find

href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892&s_ev10=EM_CMC21892_LC_stuff"

Using this RegEx: href="(https?[^"]*)"

From there, working on the captured group, I was able to pull out the part I am looking for, EM_CMC21892_LC_stuff, as follows: =[^"=]*$
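For reference, the two-step approach described above can be sketched as follows. Python is an assumption here (the question does not name a language), and the variable names are illustrative only.

```python
import re

# The sample line from the question.
line = ('<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892'
        '&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>')

# Step 1: capture the URL inside href="...".
url = re.search(r'href="(https?[^"]*)"', line).group(1)

# Step 2: on the captured URL alone, match from the last "=" to the end.
# Note that this match includes the leading "=" itself.
tail = re.search(r'=[^"=]*$', url).group(0)

print(url)
print(tail)
```

The second pattern depends on the `$` anchor, which only works because the URL has already been isolated; in the full line the value is followed by the closing quote and the rest of the tag.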

I have had no success, though, when I try to combine the two into a single RegEx.

Any thoughts?

2 answers

He's right: using regular expressions to parse HTML is just asking for trouble.
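As a rough illustration of the parser-based route, here is a minimal sketch using only Python's standard library. The class and function names are hypothetical, and it assumes "everything after the last = sign" means the value of the last query parameter, which would differ if that value itself contained an = sign.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qsl


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def last_query_value(url):
    """Return the value of the last query parameter, or None if there is none."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return pairs[-1][1] if pairs else None


html = ('<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892'
        '&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>')

collector = HrefCollector()
collector.feed(html)
values = [last_query_value(href) for href in collector.hrefs]
print(values)
```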

That said, this should work: href="http[^"]+=([^"]+?)"
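A quick way to check that pattern against the sample line, sketched in Python (an assumption, since the thread does not name a language). The greedy `[^"]+` first runs to the closing quote and then backtracks to the last = sign, so group 1 ends up holding only the final value.

```python
import re

line = ('<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892'
        '&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>')

# Single pattern: match the href and capture only what follows the last "=".
match = re.search(r'href="http[^"]+=([^"]+?)"', line)
last_value = match.group(1) if match else None

print(last_value)
```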


The usual caveats about parsing HTML/URLs with regex apply.

That said, this should do it:

/href="([^"]*=([^"]*))"/

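Edit to add: the first group captures the whole URL and the second group captures the part after the last = sign, so the match result looks like this: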

Array
(
    [0] => Array
        (
            [0] => href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892&s_ev10=EM_CMC21892_LC_stuff"
        )

    [1] => Array
        (
            [0] => http://my.domain.com/?s_cid=EM&s_ev9=CMC21892&s_ev10=EM_CMC21892_LC_stuff
        )

    [2] => Array
        (
            [0] => EM_CMC21892_LC_stuff
        )

)
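The same three entries can be reproduced with a short Python sketch (Python is an assumption here; the variable names are illustrative). Group 0 is the full match, group 1 the URL, and group 2 the text after the last = sign.

```python
import re

line = ('<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892'
        '&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>')

# Outer group: the whole attribute value. Inner group: what follows the last "=".
pattern = re.compile(r'href="([^"]*=([^"]*))"')
match = pattern.search(line)

full_match = match.group(0)   # the href="..." attribute
url = match.group(1)          # the URL only
last_value = match.group(2)   # text after the last "="

for index, text in enumerate((full_match, url, last_value)):
    print(index, text)
```

Because the inner group uses `[^"]*`, it can also match an empty string when the URL ends in an = sign.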

Source: https://habr.com/ru/post/1789126/

