Continuing at the end of a previous match in RegEx (PCRE)

I'm trying to stop the \G anchor from matching at the beginning of the string. I want it to match only at the end of the previous regular expression match.

Given the following text:

Pig, Cow, Goat
fruit: apple, orange, peach, pear
vegetable: Carrot, Lettuce, Cellery

And this pattern:

(fruit:|\G)([\w]+|[\, ])

I want it to match only the words after "fruit:", but I need to capture each word individually. If I just put a + at the very end of this pattern, it matches all the words after "fruit:" but captures only "pear", since each iteration of the + overwrites the previous capture.

Here is the problem: this pattern works, except that it also matches "Pig", "Cow", and "Goat", because \G matches at the end of the previous match OR at the beginning of the entire string. How can I stop it from matching at the beginning of the string?

I use PCRE in PHP, and I use Rubular.com to help me perform quick tests.
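
For reference, the behavior described above can be reproduced in PHP with a sketch along these lines. The variable names ($text, $pattern, $words) are illustrative and not part of the original post, and preg_grep is used only to filter out the separator tokens for display.

```php
<?php
// Sample text from the question.
$text = "Pig, Cow, Goat\nfruit: apple, orange, peach, pear\nvegetable: Carrot, Lettuce, Cellery";

// The pattern from the question, unchanged.
$pattern = '/(fruit:|\G)([\w]+|[\, ])/';

preg_match_all($pattern, $text, $matches);

// Group 2 holds words as well as lone commas and spaces; keep only the words.
$words = array_values(preg_grep('/^\w+$/', $matches[2]));

// The first line is matched too, because \G also succeeds at offset 0.
print_r($words); // Pig, Cow, Goat, apple, orange, peach, pear
```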

1 answer

In my opinion, your regular expression was not giving you what you said you wanted. You said you want every word after "fruit:". Given your example, I don't think your first attempt really gave you that. Try:

(?:fruit:\s*|\G,\s*)(\w+)

When you match against that, it should give you just the words, with no whitespace or punctuation.

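Here is a breakdown:

  • (?: - Start a non-capturing group.
  • fruit:\s* - Match "fruit:" followed by optional whitespace.
  • | - Or.
  • \G,\s*) - Match where the previous match ended, followed by a comma and optional whitespace; the ) closes the group.
  • (\w+) - Capture a word.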
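
As a rough illustration of how this pattern behaves in PHP, here is a minimal sketch using preg_match_all on the sample text from the question. The variable names are assumptions made for the example.

```php
<?php
// Sample text from the question.
$text = "Pig, Cow, Goat\nfruit: apple, orange, peach, pear\nvegetable: Carrot, Lettuce, Cellery";

// Either start at "fruit:" or continue right after the previous match,
// past a comma and optional whitespace, then capture one word.
$pattern = '/(?:fruit:\s*|\G,\s*)(\w+)/';

$count = preg_match_all($pattern, $text, $matches);

// $matches[1] holds the captures of group 1, one entry per match.
print_r($matches[1]); // apple, orange, peach, pear
```

The chain stops at the end of the "fruit:" line because the next character there is a newline rather than a comma, so the \G branch can no longer continue.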

EDIT:

To keep \G from matching at the start of the string, add a negative lookbehind in front of \G:

(?:fruit:\s*|(?<!^)\G,\s*)(\w+)
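
The difference only shows up when the subject itself begins with a comma, since that is the one case where the \G branch can succeed at offset 0. The sketch below compares both patterns on a made-up input; $text and the two pattern variables are hypothetical and chosen only for this illustration.

```php
<?php
// Hypothetical input that starts with a comma, so \G at offset 0 can match.
$text = ", Pig, Cow\nfruit: apple, orange";

// Without the guard, \G succeeds at the very start of the string.
$unguarded = '/(?:fruit:\s*|\G,\s*)(\w+)/';
// With the guard, (?<!^) rejects the position at the start of the string.
$guarded = '/(?:fruit:\s*|(?<!^)\G,\s*)(\w+)/';

preg_match_all($unguarded, $text, $before);
preg_match_all($guarded, $text, $after);

print_r($before[1]); // Pig, Cow, apple, orange
print_r($after[1]);  // apple, orange
```

Note that this relies on ^ matching only at the start of the subject, which is the case as long as the m (multiline) modifier is not set.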

Source: https://habr.com/ru/post/1782963/

