preg_match does not work in PHP >= 5.3

I am not good at regular expressions, so I don't even know exactly what this one does:

echo preg_match('/^(([a-zA-Z0-9\x2d]{1,63}\x2e)*[a-zA-Z0-9\x2d]{1,63}){1,254}$/', 'example12345678.com>'); 

I took it from an old version of Zend Framework (1.5). That version is outdated, and this regular expression is no longer present in the latest stable version of the framework. However, its behavior is curious, because I could not find a documented explanation or a backward-incompatibility note in the official PHP resources.

The fact is that on PHP 5.2.* it works fine: it returns 0. On PHP 5.3.10 and 5.4.0 (most likely on all of 5.3.* and 5.4.*) it returns FALSE, which, I believe, means "error".

My question is: why, and what is the mistake? Is it the regular expression itself, some kind of recursion, or ambiguity in the rules? And if so, why does it work on PHP 5.2?


Interestingly, if I change 'example12345678.com>' to 'example1234567.com>' (making it one or more characters shorter), it starts working and returns 0. If I change it to "123123123123123123123123123", it works and returns 1.

UPD: I don't know if this matters, but the PCRE version here is 8.02 (PHP 5.2) vs 8.12 (PHP 5.3).


UPD2: I understand what it is for, more or less, and getting something to work right now is not a problem. As I said, updating Zend_Validate_* solves it. I will try to describe my concern in other words:

Let's say I'm upgrading an important piece of software from PHP 5.2 to PHP 5.3. I try to find information about all the problems I could encounter (basically, by reading this: http://php.net/manual/en/migration53.php ). The software is a bit dated, but it is not ancient; for example, the Zend Framework version may be 1.5. I check, analyze and fix every BC break and deprecated function. Even my unit tests pass.

To my surprise, what is described in the question happens. (To be precise, Zend_Validate_Hostname throws an exception.) So now I want to know why I missed this during the upgrade and, more importantly, whether I should double-check every preg_match call (and every other use of the PCRE functions) in the application, feeding them various made-up inputs in an attempt to find similar "bug fix" errors.

If this is a "bug fix" at all, that is. It looks more like a new bug: it worked as expected in PHP 5.2 and no longer works.

I was hoping to get some tips to narrow my search.

1 answer

That is one ugly regex. The problem is that there are too many ways the string could match, so the regex engine runs out of memory trying all of them before it finds out that the string does not actually match.
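To see what the engine itself reports, it helps to tell the FALSE result apart from 0 and to ask preg_last_error() for the reason. The snippet below is a minimal sketch: describe_preg_match is a hypothetical helper name, and which error code you get (if any) depends on the PHP and PCRE build and on the pcre.backtrack_limit and pcre.recursion_limit settings, so no particular output is claimed here.

```php
<?php
// Hypothetical helper: maps the three possible preg_match results to a label.
function describe_preg_match($pattern, $subject) {
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        // FALSE means "error", not "no match"; preg_last_error() says which one.
        $code = preg_last_error();
        if ($code === PREG_BACKTRACK_LIMIT_ERROR) {
            return 'error: backtrack limit exhausted';
        }
        if ($code === PREG_RECURSION_LIMIT_ERROR) {
            return 'error: recursion limit exhausted';
        }
        return 'error: code ' . $code;
    }
    return $result === 1 ? 'match' : 'no match';
}

// The pattern from the question, unchanged.
$pattern = '/^(([a-zA-Z0-9\x2d]{1,63}\x2e)*[a-zA-Z0-9\x2d]{1,63}){1,254}$/';
echo describe_preg_match($pattern, 'example12345678.com>'), "\n";
echo describe_preg_match($pattern, 'example.com'), "\n";
```

Running this on both PHP versions should show whether the FALSE comes from one of the PCRE limits or from something else.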

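Also, it looks like it is trying to match valid domain names, which it doesn't quite do. I would replace that preg_match call with a call to this function:

    function is_valid_domain_name($string) {
        // A full domain name is at most 253 characters long.
        if (strlen($string) > 253) {
            return false;
        }
        // One label: 1 to 63 letters, digits or hyphens,
        // with no hyphen at the start or at the end.
        $label = '(?!-)[a-zA-Z0-9-]{1,63}(?<!-)';
        // Dot-separated labels; returns 1 on match, 0 otherwise.
        return preg_match("/^(?:$label\.){0,126}$label$/", $string);
    }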

On your problem string, it fails fast:

 echo is_valid_domain_name('example12345678.com>'),"\n"; 
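If you would rather keep the quantifiers out of the picture entirely, the same rules can be checked label by label, so that no single pattern has anything to backtrack over. This is only a sketch of that alternative, not the function above and not what Zend Framework does; the name is_valid_domain_name_by_label is made up for the example.

```php
<?php
// Hypothetical alternative: validate each dot-separated label on its own.
function is_valid_domain_name_by_label($string) {
    $length = strlen($string);
    if ($length === 0 || $length > 253) {
        return false;
    }
    foreach (explode('.', $string) as $label) {
        // 1 to 63 letters, digits or hyphens; D stops $ from matching
        // before a trailing newline.
        if (!preg_match('/^[a-zA-Z0-9-]{1,63}$/D', $label)) {
            return false;
        }
        // No hyphen at the start or at the end of a label.
        if ($label[0] === '-' || substr($label, -1) === '-') {
            return false;
        }
    }
    return true;
}

var_dump(is_valid_domain_name_by_label('example12345678.com>'));
var_dump(is_valid_domain_name_by_label('example.com'));
```

Each preg_match call here sees at most one label, and the pattern has a single quantifier, so the running time stays proportional to the length of the input.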

Source: https://habr.com/ru/post/916047/

