PHP RegExp for URL string

Example strings:

    accuracy-is-5
    accuracy-is-5-or-15
    accuracy-is-5-or-15-or-20
    package-is-dip-8-or-dip-4-or-dip-16

Current regex:

 /^([a-z0-9\-]+)\-is\-([a-z0-9\.\-]*[a-z0-9])(?:\-or\-([a-z0-9\.\-]*[a-z0-9]))*$/U 

The string has no fixed length; the part:

 \-or\-[a-z0-9\.\-] 

can be repeated any number of times.

But now, from the string "accuracy-is-5-or-15-or-20", I get:

    Array
    (
        [0] => accuracy-is-5-or-15-or-20
        [1] => accuracy
        [2] => 5
        [3] => 20
    )

Where is 15? :) Tnx.

2 answers

When a capture group is repeated in a pattern, each iteration overwrites the value captured by the previous one, so only the last value is kept. It is therefore not possible to collect every value with a single preg_match call and a pattern built this way.
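To see the overwriting in isolation, here is a minimal sketch. The pattern is a simplified version of the one in the question (values without dots or hyphens), which is an assumption made only to keep the example short:

```php
<?php
// Simplified form of the question's pattern (an assumption for brevity):
// group 3 sits inside a repeated (?: ... )* group.
$pattern = '/^([a-z0-9]+)-is-([a-z0-9]+)(?:-or-([a-z0-9]+))*$/';

preg_match($pattern, 'accuracy-is-5-or-15-or-20', $m);

// Group 3 captured "15" on the first iteration and "20" on the second;
// only the capture from the final iteration is reported.
print_r($m);
```

The "15" was matched, but it never reaches the result array because group 3 has only one slot.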

A possible workaround is to use preg_match_all, which searches for all occurrences of a pattern, together with the \G anchor, which matches at the position where the previous match ended. The pattern then has to be written so that it finds one value at a time.

\G ensures that all matches are contiguous. To make sure that the end of the string has been reached (in other words, that the string is well-formed from start to end), a convenient trick is to put an empty capture group at the end of the pattern. If this capture group is present in the last match, the format is correct.
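A small sketch of what "contiguous" means here, using a made-up input string rather than the strings from the question:

```php
<?php
$input = '1 2 x 3'; // illustrative input, not from the question

// Without \G the engine skips the "x" and keeps finding numbers after it.
preg_match_all('/\d+/', $input, $loose);

// With \G each match must begin exactly where the previous one ended,
// so matching stops at the first character that does not fit.
preg_match_all('/\G\d+ ?/', $input, $strict);

var_dump($loose[0], $strict[0]);
```

The second call never reaches the "3", which is why a pattern built on \G can be used to detect a badly formatted string rather than silently skipping over the bad part.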

    define('PARSE_SENTENCE_PATTERN', '~
    (?:                                       # two possible beginnings:
        \G(?!\A)                              # - immediately after a previous match
      |                                       # OR
        \A                                    # - at the start of the string
        (?<subject> \w+ (?>[-.]\w+)*? ) -is-  #   (in this case the subject is captured)
    )
    (?<value> \w+ (?>[-.]\w+)*? )  # capture the value
    (?: -or- | \z (?<check>) )     # must be followed by "-or-" OR the end of the string \z
                                   # (then the empty capture group "check" is created)
    ~x');

    function parseSentence ($sentence) {
        if (preg_match_all(PARSE_SENTENCE_PATTERN, $sentence, $matches, PREG_SET_ORDER) &&
            isset(end($matches)['check']) )
            return [ 'subject' => $matches[0]['subject'],
                     'values'  => array_reduce ($matches, function($c, $v) {
                                      $c[] = $v['value']; return $c; }, $c = []) ];

        return false; // wrong format
    }

    // tests
    $test_strings = ['accuracy-is-5', 'accuracy-is-5-or-15', 'accuracy-is-5-or-15-or-20',
                     'package-is-dip-8-or-dip-4-or-dip-16',
                     'bad-format', 'bad-format-is-', 'bad-format-is-5-or-'];

    foreach ($test_strings as $test_string) {
        var_dump(parseSentence($test_string));
    }
 ^\w+(?:-[a-zA-Z]+)+\K|\G(?!^)-(\d+)(?:(?:-[a-zA-Z]+)+|$) 

Here you can use \G to capture all the values. When a capture group is repeated, the last value overwrites the previous ones. See the demo.

https://regex101.com/r/tS1hW2/3

\G asserts the position at the end of the previous match, or at the start of the string for the first match.
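Because \G also matches at the very start of the string, the (?!^) guard is what stops the second alternative from firing there. A minimal sketch with a made-up input:

```php
<?php
$input = '-1-2-3'; // illustrative input, not from the question

// Plain \G: the first match is allowed to begin at offset 0.
$withoutGuard = preg_match_all('/\G-(\d+)/', $input, $a);

// \G(?!^): the first match may NOT begin at offset 0, and \G cannot match
// anywhere else before a first match exists, so nothing is found.
$withGuard = preg_match_all('/\G(?!^)-(\d+)/', $input, $b);

var_dump($withoutGuard, $a[1], $withGuard, $b[1]);
```

That is why the pattern needs its first alternative (the ^... branch): it provides the initial match that the \G(?!^) branch then continues from.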

EDIT:

 ^\w+-is(?:-dip)?\K|\G(?!^)-(\d+)(?:-or(?:-dip)?|$) 

You can use this if you are sure that the only keywords are is, or and dip. See the demo.

https://regex101.com/r/tS1hW2/4

    $re = "/^\\w+-is(?:-dip)?\\K|\\G(?!^)-(\\d+)(?:-or(?:-dip)?|$)/m";
    $str = "accuracy-is-5\naccuracy-is-5-or-15\naccuracy-is-5-or-15-or-20\npackage-is-dip-8-or-dip-4-or-dip-16";
    preg_match_all($re, $str, $matches);
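For a single string, the numbers end up in group 1 of the result. A rough sketch of how they could be pulled out follows; the helper name extractNumbers is hypothetical and not part of the answer, and the /m flag is dropped on the assumption that only one line is passed in:

```php
<?php
// Hypothetical helper (name and shape are assumptions, not from the answer).
function extractNumbers(string $input): array
{
    // Same pattern as above, written in single quotes and without /m.
    $re = '/^\w+-is(?:-dip)?\K|\G(?!^)-(\d+)(?:-or(?:-dip)?|$)/';
    preg_match_all($re, $input, $matches);

    // The first alternative has no group 1, so drop any empty entries
    // and keep only the captured numbers.
    $numbers = array_filter($matches[1], function ($v) {
        return $v !== '';
    });

    return array_values($numbers);
}

print_r(extractNumbers('accuracy-is-5-or-15-or-20'));
print_r(extractNumbers('package-is-dip-8-or-dip-4-or-dip-16'));
```

Note that this pattern captures only the digits, so "dip-8" comes back as "8", and it does not check that the whole string is well-formed the way the empty "check" group in the first answer does.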

Source: https://habr.com/ru/post/988013/

