Regular expression look-behind problem (Ruby)

I wrote this regex to match all href and src links on an HTML page (I know I should use a parser; this is just experimentation):

/((href|src)\=\").*?\"/ # Without look-behind

It works fine, but when I try to turn the first part of the expression into a look-behind pattern:

/(?<=(href|src)\=\").*?\"/ # With look-behind

It throws the error "invalid pattern in look-behind". Any ideas what is wrong with the look-behind?

1 answer

Look-behind has limitations:

  (?<=subexp)        look-behind
  (?<!subexp)        negative look-behind

                     Subexp of look-behind must be fixed character length.
                     But different character length is allowed in top level
                     alternatives only.
                     ex. (?<=a|bc) is OK. (?<=aaa(?:b|cd)) is not allowed.

                     In negative-look-behind, captured group isn't allowed,
                     but shy group(?:) is allowed.
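
One way to explore these rules is to check which patterns your Ruby build accepts. The sketch below is only an illustration: the `compiles?` helper is a hypothetical name, and the results for the rejected forms may depend on the Ruby and regex engine version, so no output is claimed for them here.

```ruby
# Hypothetical helper: true if Ruby accepts the pattern source,
# false if compiling it raises RegexpError.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

candidates = [
  '(?<=a|bc)',               # top-level alternatives of different length
  '(?<=aaa(?:b|cd))',        # nested alternatives of different length
  '(?<=(href|src)=").*?"',   # the pattern from the question
  '(?<=href="|src=").*?"'    # alternatives moved to the top level
]

# Print each pattern source next to whether this Ruby accepts it.
candidates.each do |source|
  puts "#{source.inspect} compiles: #{compiles?(source)}"
end
```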

You cannot place alternatives of different lengths at a non-top level inside a (negative) look-behind.

Put them at the top level. You also do not need to escape some of the characters you escaped (= and " have no special meaning here).

 /(?<=href="|src=").*?"/ 
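
As a quick check, here is a minimal Ruby sketch that applies the corrected pattern to a made-up HTML snippet. The sample string, the constant names, and the `extract_links` helper are illustrative and not part of the question. Note that `.*?"` still consumes the closing quote; the second pattern is one possible variant (an assumption, not from the answer) that uses a look-ahead to leave it out.

```ruby
# Corrected pattern from the answer: the alternatives sit at the top
# level of the look-behind, so their different lengths are allowed.
WITH_QUOTE = /(?<=href="|src=").*?"/

# Variant (an assumption, not part of the answer): a look-ahead keeps
# the closing quote out of the match.
WITHOUT_QUOTE = /(?<=href="|src=")[^"]*(?=")/

# Hypothetical helper: returns every substring matched by the pattern.
def extract_links(html, pattern)
  html.scan(pattern)
end

sample = '<a href="/a.html">A</a><img src="b.png">'

p extract_links(sample, WITH_QUOTE)     # matches include the closing quote
p extract_links(sample, WITHOUT_QUOTE)  # matches stop before the closing quote
```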

Source: https://habr.com/ru/post/957969/

