Make the zero-or-one regex operator greedy

I have two sentences as input. For example:

<span>I love my red car.</span> <span>I love my car.</span> 

Now I want to match the text inside each span tag, and also capture the color if one is present.

If I use the following regex:

 /<span>(.*?)(?P<color>red)(.*?)<\/span>/ms 

Only the sentence with a color is matched. So I thought I could use the ? operator (zero or one):

 /<span>(.*?)(?P<color>red)?(.*?)<\/span>/ms 

Now both sentences are matched. Unfortunately, the color is no longer captured.

The question is why? By putting .*? in front of the color part, I thought I had made the regular expression non-greedy, so that the color part would match if it exists. But as said, this is not ...

2 answers

The first (.*?) matches the empty string between > and I, and since it is lazy, the engine immediately tries the next part of the regular expression: (?P<color>red)?. There is no red at this position, so the "zero" option of ? is taken and the engine moves on to the next part, the second (.*?). It too starts as an empty match between > and I, and since it is lazy, the engine tries the next part of the regular expression: <\/span> (I take it as a whole).

That fails at I, so the second (.*?) grows one character at a time until <\/span> matches. The second (.*?) therefore ends up matching the entire sentence.

In fact, your results[1] will be empty, as will results[color] (I don’t remember whether color needs to be quoted), and results[3] will contain I love my red car.
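To see that behaviour concretely, here is a minimal sketch using Python's re module rather than PHP (an assumption on my part; the thread itself is about PCRE). The variable names are made up for illustration. One difference to keep in mind: Python reports an unmatched group as None, where PHP gives an empty string.

```python
import re

text = "<span>I love my red car.</span> <span>I love my car.</span>"

# The pattern from the question, with the optional color group.
pattern = re.compile(r"<span>(.*?)(?P<color>red)?(.*?)</span>", re.M | re.S)

# Collect (group 1, color, group 3) for every match.
rows = [(m.group(1), m.group("color"), m.group(3)) for m in pattern.finditer(text)]

for row in rows:
    print(row)
# ('', None, 'I love my red car.')
# ('', None, 'I love my car.')
```

Both sentences match, but the color group never participates: the second lazy group swallows the whole sentence before the engine ever has a reason to backtrack into the first one.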

One way is to use alternation, as NickC mentions in his answer. Another option is to apply a negative lookahead at each character:

 <span>((?:(?!\bred\b).)*(?<colour>\bred\b)?.*)<\/span> 


As a side note, I would suggest using word boundaries so that you don't match things like reduce or jarred.
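A rough sketch of the lookahead approach, again in Python's re as an assumption. Python spells the named group (?P<colour>...) instead of (?<colour>...), and the sample strings are invented, one span per string, because the trailing .* is greedy and would otherwise run on to the last closing tag on the line.

```python
import re

# Same idea as the pattern above: consume characters only while the
# whole word "red" does not start here, then optionally capture it.
pattern = re.compile(r"<span>((?:(?!\bred\b).)*(?P<colour>\bred\b)?.*)</span>")

samples = [
    "<span>I love my red car.</span>",
    "<span>I love my car.</span>",
    "<span>I reduce my speed.</span>",  # contains "red" only inside a word
]

colours = [pattern.search(s).group("colour") for s in samples]
print(colours)  # ['red', None, None]
```

The third sample shows what the word boundaries buy you: without \b, the red in reduce would have been captured as a color.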


This should work:

 /<span>(.*?(?P<color>red).*?|.*?)<\/span>/ms 

Your original expression was nearly right. I changed it slightly by wrapping the whole sentence in a new outer group. I used this outer group to create an "or" condition that matches "anything" when the color is missing.

Abbreviated output:

 Array
   [0] => Array
     [0] => <span>I love my red car.</span>
     [1] => <span>I love my car.</span>
   [1] => Array
     [0] => I love my red car.
     [1] => I love my car.
   [color] => Array
     [0] => red
     [1] =>
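For comparison, a small sketch of the same alternation pattern run through Python's re (my assumption; the output above comes from PHP). The names text and results are illustrative only, and Python shows the missing color as None rather than an empty string.

```python
import re

text = "<span>I love my red car.</span> <span>I love my car.</span>"

# First alternative requires "red" somewhere in the sentence;
# the second alternative accepts any sentence without capturing a color.
pattern = re.compile(r"<span>(.*?(?P<color>red).*?|.*?)</span>", re.M | re.S)

results = [(m.group(1), m.group("color")) for m in pattern.finditer(text)]

for sentence, color in results:
    print(sentence, "->", color)
# I love my red car. -> red
# I love my car. -> None
```

The order of the alternatives matters here: the branch that contains the color has to come first, otherwise the plain .*? branch would succeed on every sentence and the color would never be tried.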

Source: https://habr.com/ru/post/954040/

