Matching everything until the next match

I want to match the HTML from each occurrence of "..." up to the next occurrence of "..." or the end of the input.

I currently have the following regex:

(<font color=\"#777777\">\.\.\. .+?<\/font>) 

This matches exactly the following:

 1. <font color="#777777">... </font><font color="#000000">lives up to the customer expectations. The subscriber is </font>
 2. <font color="#777777">... You may not want them to be </font>
 3. <font color="#777777">... </font><font color="#000000">the web link, and </font>

But I would like to get:

 1. <font color="#777777">... </font><font color="#000000">lives up to the customer expectations. The subscriber is </font><font color="#777777">obviously thinking about your merchandise </font><font color="#000000">in case they have clicked about the link in your email.</font>
 2. <font color="#777777">... You may not want them to be </font><font color="#000000">disappointed by simply clicking </font>
 3. <font color="#777777">... </font><font color="#000000">the web link, and </font><font color="#777777">finding </font><font color="#000000">the page to </font><font color="#777777">get other than </font><font color="#000000">what they thought it </font><font color="#777777">will be.. If America makes</font>

Here is the HTML I want to parse:

 <font color="#777777">... </font><font color="#000000">lives up to the customer expectations. The subscriber is </font><font color="#777777">obviously thinking about your merchandise </font><font color="#000000">in case they have clicked about the link in your email.</font><font color="#777777">... You may not want them to be </font><font color="#000000">disappointed by simply clicking </font><font color="#777777">... </font><font color="#000000">the web link, and </font><font color="#777777">finding </font><font color="#000000">the page to </font><font color="#777777">get other than </font><font color="#000000">what they thought it </font><font color="#777777">will be.. If America makes</font> 

And a demo: http://rubular.com/r/mmQ4TBZb96

How can I extend each match that starts with "..." so that it runs up to the next "..." and I get the desired matches above?

Thanks for the help!

2 answers

Although your question seems a little inconsistent (I don't see why you would expect the final desired match), I think this is what you need:

 ((<font color=\"#777777\">\.{3}) .+?(<\/font>(?=\s*\2)|$)) 

It uses a lookahead so that each match ends just before the next "..." sequence (or at the end of the input).

See it on Rubular.
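
As a rough sketch, the pattern from this answer can be tried in Python on the HTML from the question. Python itself and the names `HTML`, `PATTERN`, and `match_sections` are assumptions made for illustration only; the regex is the one above, minus the backslashes before the quotes and the slash, which Python does not need.

```python
import re

# The HTML from the question as one line (adjacent string literals are concatenated).
HTML = (
    '<font color="#777777">... </font>'
    '<font color="#000000">lives up to the customer expectations. The subscriber is </font>'
    '<font color="#777777">obviously thinking about your merchandise </font>'
    '<font color="#000000">in case they have clicked about the link in your email.</font>'
    '<font color="#777777">... You may not want them to be </font>'
    '<font color="#000000">disappointed by simply clicking </font>'
    '<font color="#777777">... </font>'
    '<font color="#000000">the web link, and </font>'
    '<font color="#777777">finding </font>'
    '<font color="#000000">the page to </font>'
    '<font color="#777777">get other than </font>'
    '<font color="#000000">what they thought it </font>'
    '<font color="#777777">will be.. If America makes</font>'
)

# Group 2 captures the opening tag plus "...". The lookahead (?=\s*\2) ends a
# match at a </font> that is directly followed by the next such opener, and
# the alternative `$` lets the last section run to the end of the input.
PATTERN = re.compile(r'((<font color="#777777">\.{3}) .+?(</font>(?=\s*\2)|$))')


def match_sections(html):
    """Return every section that starts with '...' as one string (hypothetical helper)."""
    return [m.group(1) for m in PATTERN.finditer(html)]


sections = match_sections(HTML)
for number, section in enumerate(sections, start=1):
    print(number, section)
```

On this input the sketch yields three sections, one per "..." opener. If the real input spans several lines, the `re.DOTALL` flag would likely be needed so that `.` can cross newlines.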


The question asks for a regex match, but you can also split the text instead (Perl syntax, though I believe similar functions exist in other languages):

 split(/(?=<font color=\"#777777\">\.\.\.)/, $your_text) 
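
For comparison, here is a minimal sketch of the same split-on-lookahead idea in Python. It assumes Python 3.7 or later, where `re.split` accepts patterns that match the empty string; the names `HTML`, `BOUNDARY`, and `split_sections` are hypothetical and not part of the answer.

```python
import re

# The HTML from the question as one line (adjacent string literals are concatenated).
HTML = (
    '<font color="#777777">... </font>'
    '<font color="#000000">lives up to the customer expectations. The subscriber is </font>'
    '<font color="#777777">obviously thinking about your merchandise </font>'
    '<font color="#000000">in case they have clicked about the link in your email.</font>'
    '<font color="#777777">... You may not want them to be </font>'
    '<font color="#000000">disappointed by simply clicking </font>'
    '<font color="#777777">... </font>'
    '<font color="#000000">the web link, and </font>'
    '<font color="#777777">finding </font>'
    '<font color="#000000">the page to </font>'
    '<font color="#777777">get other than </font>'
    '<font color="#000000">what they thought it </font>'
    '<font color="#777777">will be.. If America makes</font>'
)

# Zero-width lookahead: marks the position just before every opener that is
# followed by "...", without consuming any text (needs Python 3.7+ in re.split).
BOUNDARY = re.compile(r'(?=<font color="#777777">\.\.\.)')


def split_sections(html):
    """Split html before each '...' opener (hypothetical helper)."""
    # re.split yields an empty first piece when the text itself starts at a
    # boundary, so empty pieces are filtered out.
    return [piece for piece in BOUNDARY.split(html) if piece]


pieces = split_sections(HTML)
for number, piece in enumerate(pieces, start=1):
    print(number, piece)
```

Because the lookahead consumes nothing, joining the pieces back together reproduces the original text, which makes the split easy to sanity-check.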

Source: https://habr.com/ru/post/1489684/

