Matching all numbers with regex using preg_match_all

I have some text and am trying to add a link to every three-digit number in it.
I am using preg_match_all with this pattern: (^|[^\d])(\d{3})($|[^\d])
The groups are there so that the link wraps only the number and not its neighboring characters. Test examples:

  • a 123 234 b - should match 123 and 234
  • a 123_234 b - should match 123 and 234
  • aa123 234 b - should match 123 and 234
  • a0123 234 b - should match only 234
  • 123a234 b - should match 123 and 234
  • a 123 234 - should match 123 and 234

Tests 2 and 3 work fine; the rest fail because of the single space between the two numbers.
How can I match both numbers when only one space separates them?

2 answers

You can "fix" your regular expression by simply replacing the last capturing group with a positive lookahead: (^|[^\d])(\d{3})(?=$|[^\d]). That way adjacent numbers can both match. In your pattern, the group ($|[^\d]) consumed the space after the first three-digit chunk, so the leading group (^|[^\d]) had nothing left to match before the second number. Also, I would replace [^\d] with \D if you prefer this approach.
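To make the difference concrete, here is a minimal sketch that runs both patterns over the first test string with preg_match_all. The variable names are only illustrative, and \D is used in place of [^\d] as suggested above.

```php
<?php
// Original idea: the trailing group consumes the character after the number.
$consuming = '/(^|\D)(\d{3})($|\D)/';
// Fixed idea: the trailing check is a lookahead, so nothing is consumed.
$lookahead = '/(^|\D)(\d{3})(?=$|\D)/';

$text = 'a 123 234 b';

preg_match_all($consuming, $text, $m1);
preg_match_all($lookahead, $text, $m2);

// Group 2 holds the three-digit numbers in both patterns.
echo implode(',', $m1[2]), "\n"; // 123
echo implode(',', $m2[2]), "\n"; // 123,234
```

With the consuming version, the space after 123 is already part of the first match, so 234 has no leading non-digit left to pair with.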

I would suggest using negative lookarounds instead, as that looks “cleaner”:

 (?<!\d)\d{3}(?!\d)
 ^^^^^^^     ^^^^^^

See the regex demo.

More details

  • (?<!\d) - the current position must not be preceded by a digit
  • \d{3} - 3 digits
  • (?!\d) - the current position must not be followed by a digit
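As a quick check, the lookaround pattern can be run against the six test strings from the question. This is only a sketch; the $tests and $results names are made up for the example.

```php
<?php
$pattern = '/(?<!\d)\d{3}(?!\d)/';

$tests = [
    'a 123 234 b',
    'a 123_234 b',
    'aa123 234 b',
    'a0123 234 b',
    '123a234 b',
    'a 123 234',
];

$results = [];
foreach ($tests as $text) {
    // $m[0] lists the full matches; no capturing group is needed here
    // because the lookarounds do not consume any characters.
    preg_match_all($pattern, $text, $m);
    $results[$text] = $m[0];
    echo $text, ' => ', implode(', ', $m[0]), "\n";
}
```

Every string should yield 123 and 234, except a0123 234 b, where 0123 is rejected because each three-digit window inside it touches another digit.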

Here are my two cents:

 \d{4,}(*SKIP)(*FAIL)|(\d{3}) 

An example of this regular expression is here.

It means:

 \d{4,}(*SKIP)(*FAIL) -> match 4 digits or more but skip the match
 |                    -> or
 (\d{3})              -> match 3 digits and capture them

This means that the regular expression will match ONLY runs of exactly three digits, and each one ends up in the capture group.
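Since the question asks about preg_match_all, here is a small sketch of how the captured numbers could be collected with it. The sample string is invented for illustration; the pattern is the one shown above.

```php
<?php
$regex = '/\d{4,}(*SKIP)(*FAIL)|(\d{3})/';

// Hypothetical input mixing 3-digit numbers with longer digit runs.
$text = 'a0123 234 b 56789 777';

preg_match_all($regex, $text, $matches);

// $matches[1] holds what capture group 1 matched in each match.
echo implode(',', $matches[1]), "\n"; // 234,777
```

The runs 0123 and 56789 are consumed by the first alternative and then discarded, so only 234 and 777 reach the capturing group.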

Hope this helps.

EDIT :

Added (*SKIP)(*FAIL) verbs.

These two verbs force the match to fail on any run of four or more digits and make the engine skip past that run instead of retrying inside it. The replacement can then be made. (See the replacement part of the regex101 example.)

In PHP, the code would look like this:

 $arr = array(
     "a 123 234 b",
     "a 123_234 b",
     "aa123 234 b",
     "a0123 234 b",
     "123a234 b",
     "a 123 234"
 );

 $regex = "/\d{4,}(*SKIP)(*FAIL)|(\d{3})/";

 foreach ($arr as $item) {
     echo preg_replace($regex, '<a href="#">$1</a>', $item);
     echo "\r\n";
 }

Output:

 a <a href="#">123</a> <a href="#">234</a> b
 a <a href="#">123</a>_<a href="#">234</a> b
 aa<a href="#">123</a> <a href="#">234</a> b
 a0123 <a href="#">234</a> b
 <a href="#">123</a>a<a href="#">234</a> b
 a <a href="#">123</a> <a href="#">234</a>

Source: https://habr.com/ru/post/1263925/

