Regular expression for detecting consecutive digits - does not work for non-English input

Hi, I have this code that checks for 5 or more consecutive digits:

if (preg_match("/\d{5}/", $input, $matches) > 0)
return true;

It works fine for English input, but when the input string contains Arabic or other multibyte characters, it sometimes returns true even if there are no digits in the input text.

Any ideas?

+3
3 answers

You seem to be using PHP.

Do this:

if (preg_match("/\d{5}/u", $input, $matches) > 0)
return true;

Note the 'u' modifier at the end of the expression. It tells the preg_* functions to treat the pattern and the subject string as UTF-8.
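To make the check above easier to reuse, here is a minimal sketch that wraps it in a function. The name hasFiveDigits is hypothetical and not part of the original post. It also shows one side effect of the 'u' modifier: if the subject is not valid UTF-8, preg_match() returns false instead of a match count, so the comparison has to allow for that.

```php
<?php
// Hypothetical helper (not from the original post): true when $input
// contains a run of at least five digits.
function hasFiveDigits(string $input): bool
{
    // With /u, pattern and subject are treated as UTF-8. If $input is not
    // valid UTF-8, preg_match() returns false, so this comparison is false.
    return preg_match('/\d{5}/u', $input) === 1;
}

var_dump(hasFiveDigits('order 12345'));  // bool(true)

// Arabic text that contains no digits at all.
var_dump(hasFiveDigits('مرحبا بالعالم'));  // bool(false)

// A lone lead byte is not valid UTF-8.
var_dump(hasFiveDigits("\xD9"));  // bool(false)
```

The original `> 0` comparison behaves the same way for these inputs, because false is not greater than 0.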

+6

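Your input is presumably UTF-8.

PHP's PCRE has to be built with UTF-8 support.

You can put (*UTF8) at the start of the pattern. For example:

/(*UTF8)[[:alnum:]]/, input é, output TRUE

/[[:alnum:]]/, input é, output FALSE.

See http://www.pcre.org/pcre.txt, which describes UTF-8 support in PCRE.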

0

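Even in UTF-8 mode, predefined classes such as \d and [[:digit:]] may match only ASCII digits, depending on the PCRE version and options. To match non-ASCII digits as well, use the Unicode property \p{Nd}: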

$s = "12345\xD9\xA1\xD9\xA2\xD9\xA3\xD9\xA4\xD9\xA5";
preg_match_all('~\p{Nd}{5}~u', $s, $matches);
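As a rough illustration of what that call finds, the sketch below runs the same pattern on the same string and prints the number of matches and the byte length of each. The variable name $count is added here for illustration only.

```php
<?php
// "12345" followed by the Arabic-Indic digits U+0661..U+0665 as UTF-8 bytes.
$s = "12345\xD9\xA1\xD9\xA2\xD9\xA3\xD9\xA4\xD9\xA5";

// \p{Nd} matches any Unicode decimal digit, ASCII or not.
$count = preg_match_all('~\p{Nd}{5}~u', $s, $matches);

echo $count, "\n";                  // 2
echo strlen($matches[0][0]), "\n";  // 5  (ASCII run, 1 byte per digit)
echo strlen($matches[0][1]), "\n";  // 10 (Arabic-Indic run, 2 bytes per digit)
```

Both runs are five characters long, but strlen() counts bytes, which is why the second match reports 10.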


If you need to match specific characters or ranges, you can use the escape sequence \x{HHHH} with the corresponding code points:

preg_match_all('~[\x{0661}-\x{0665}]{5}~u', $s, $matches);

... or use the form \xHH to enter UTF-8 encoded byte sequences:

preg_match_all("~[\xD9\xA1-\xD9\xA5]{5}~u", $s, $matches);

Note that for the last example, I switched to double quotes. The \p{} and \x{} forms were passed through untouched to be processed by the regex compiler, but this time we want PHP itself to expand the escape sequences. That does not happen in single-quoted strings.
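The quoting difference is easy to check directly. This is a small sketch, with the variable names $single and $double chosen here for illustration, that compares what PHP stores for the same escape text in each kind of string:

```php
<?php
// Single quotes: the text is kept literally (backslash, x, D, 9, ...).
$single = '\xD9\xA1';

// Double quotes: PHP expands each \xHH into one raw byte, giving the
// two-byte UTF-8 encoding of U+0661.
$double = "\xD9\xA1";

echo strlen($single), "\n";   // 8
echo strlen($double), "\n";   // 2
echo bin2hex($double), "\n";  // d9a1
```

In the single-quoted case the regex engine receives the escape text and interprets it itself; in the double-quoted case it receives the already encoded bytes.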

0

Source: https://habr.com/ru/post/1783993/

