Convert an integer numeric interval to a regular expression

I am looking for a way to convert an integer interval to a regular expression. Suppose I have two numbers, A and B. Both are positive integers, and A < B.

I need an algorithm (maybe code) that produces a single regular expression matching every number between A and B, bounds included. For example, with A=20 and B=35 the correct regular expression is ^2[0-9]$|^3[0-5]$, since it matches only the numbers 20..35.

In the general case, when A is something like 83724 and B is something like 28543485, it is not so obvious.

Update: this is mostly a matter of curiosity. I know there is a better way to do this: return A <= X && X <= B

3 answers

Why use regex in this situation?

I would just do this:

boolean isBetween = num >= A && num <= B;

(code is written in Java)

This is much simpler. A regular expression like the one you ask for can be huge, and using it in this situation would be pointless and inefficient.

Good luck.

If you really insist on using a regex for this task, see this website, run the regex in verbose mode, and it will explain to you how the regex works.

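Numbers whose digit count lies strictly between the digit counts of A and B can all be matched in one go. In the example, all numbers of 6-7 digits are matched by [1-9][0-9]{5,6}. The border cases can then be built recursively (shown here for the A side):

  • Let S be A.
  • Let f be the first digit of S, g=f+1, and n be (digits of S)-1.
  • For a first digit greater than f: [g-9][0-9]{n}
  • For a first digit equal to f: f(recursive call starting from step 2, with S=the rest of digits of S)

For example, A=123 gives (spaces added only for "readability"):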

([2-9][0-9]{2}) | (1(([3-9][0-9]{1}) | (2(([4-9]) | 3))) )
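
The recursive steps for the A side can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the input is assumed to be a non-empty string of decimal digits, and the handling of a leading 9 (where no greater digit exists) is an addition that the steps above do not spell out.

```java
class LowerBoundRegex {
    // Builds an unanchored pattern for all digit strings of the same length
    // as s whose numeric value is >= s.
    // Assumption: s is non-empty and contains only the digits 0-9.
    static String atLeast(String s) {
        char f = s.charAt(0);          // first digit of S
        int n = s.length() - 1;        // number of remaining digits
        // Branch "first digit equal to f": f followed by the recursive pattern
        // for the rest of the digits (or just f when nothing is left).
        String equal = (n == 0)
                ? String.valueOf(f)
                : f + "(" + atLeast(s.substring(1)) + ")";
        if (f == '9') {
            // Assumption added here: no digit is greater than 9,
            // so the [g-9] branch is skipped entirely.
            return equal;
        }
        char g = (char) (f + 1);
        // Branch "first digit greater than f": [g-9] followed by n free digits.
        String greater = "[" + g + "-9]" + (n > 0 ? "[0-9]{" + n + "}" : "");
        return greater + "|" + equal;
    }
}
```

For A=123 this returns [2-9][0-9]{2}|1([3-9][0-9]{1}|2([4-9]|3)), which is the expression above without the extra parentheses and spaces. Because the result has an alternation at the top level, it should be wrapped as ^(?:...)$ or passed to a full-match call such as java.util.regex.Pattern.matches before use. It covers only numbers with the same digit count as A.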

My solution (in PHP):

class Converter
{
    const REGEXP_OR     = '|';
    const REGEXP_START  = '^';
    const REGEXP_END    = '$';

    protected $sStart;
    protected $sEnd;
    function __construct($mStart, $mEnd=null)
    {
        if(is_array($mStart) && count($mStart)>1)
        {
            $this->sStart = (string)($mStart[0]);
            $this->sEnd   = (string)($mStart[1]);
        }
        else
        {
            $this->sStart = (string)($mStart);
            $this->sEnd   = (string)($mEnd);
        }
        // Compare the stored bounds so that the array form of the arguments is validated too
        if((int)($this->sStart)>(int)($this->sEnd))
        {
            $this->sStart = $this->sEnd = null;
        }
    }

    public function getRegexp()
    {
        return self::REGEXP_START.$this->_get_regexp_by_range($this->sStart, $this->sEnd).self::REGEXP_END;
    }

    protected function _get_regexp_by_range($sStart, $sEnd, $sOr=self::REGEXP_OR, $sFrom=self::REGEXP_START, $sTill=self::REGEXP_END)
    {
       if(!isset($sStart) || !isset($sEnd))
       {
           return null;
       }
       if((int)($sStart)>(int)($sEnd))
       {
          return null;
       }
       elseif($sStart==$sEnd)
       {
          return $sStart;
       }
       elseif(strlen($sEnd)>strlen($sStart))
       {
          $rgRegexp  = array($this->_get_regexp_by_range($sStart, str_repeat('9', strlen($sStart))));
          // $i is the number of zeros after the leading 1, so each pass covers the numbers of $i+1 digits
          for($i=strlen($sStart); $i<strlen($sEnd)-1; $i++)
          {
             $rgRegexp[] = $this->_get_regexp_by_range('1'.str_repeat('0', $i), str_repeat('9', $i+1));
          }
          $rgRegexp[] = $this->_get_regexp_by_range('1'.str_repeat('0', strlen($sEnd)-1), $sEnd);
          return join($sTill.$sOr.$sFrom, $rgRegexp);
       }
       else
       {
          $rgRegexp   = array();
          for($iIntersect=0;$iIntersect<strlen($sStart);$iIntersect++)
          {
             if($sStart[$iIntersect]!=$sEnd[$iIntersect])
             {
                break;
             }
          }
          if($iIntersect)
          {
             return join($sTill.$sOr.$sFrom, array_map(function($sItem) use ($iIntersect, $sStart)
             {
                return substr($sStart, 0, $iIntersect).$sItem;
             }, explode($sTill.$sOr.$sFrom, $this->_get_regexp_by_range(substr($sStart, $iIntersect), substr($sEnd, $iIntersect)))));
          }
          else
          {
             $rgRegexp = array($sStart);
             for($iPos=strlen($sStart)-1; $iPos>0; $iPos--)
             {
                if($sStart[$iPos]+1<10)
                {
                   $rgRegexp[]=substr($sStart, 0, $iPos).'['.($sStart[$iPos]+1).'-'.'9'.']'.str_repeat('[0-9]', strlen($sStart)-$iPos-1);
                }
             }
             if(($sStart[0]+1)<($sEnd[0]-1))
             {
                $rgRegexp[]='['.($sStart[0]+1).'-'.($sEnd[0]-1).']'.str_repeat('[0-9]', strlen($sStart)-1);
             }
             elseif((int)($sStart[0])+1==(int)($sEnd[0])-1)
             {
                $rgRegexp[]=($sStart[0]+1).str_repeat('[0-9]', strlen($sStart)-1);
             }
             for($iPos=1; $iPos<strlen($sEnd); $iPos++)
             {
                if($sEnd[$iPos]-1>=0)
                {
                  $rgRegexp[]=substr($sEnd,0, $iPos).'['.'0'.'-'.($sEnd[$iPos]-1).']'.str_repeat('[0-9]', strlen($sEnd)-$iPos-1);
                }
             }
             $rgRegexp[]=$sEnd;
             return join($sTill.$sOr.$sFrom, $rgRegexp);
          }
       }
    }
}

It gives correct results for any input strings, but I suspect the resulting expression is not the shortest possible.

$sPattern = (new Converter('1', '1000000000'))->getRegexp();
var_dump(
   preg_match('/'.$sPattern.'/', '10000000000'), 
   preg_match('/'.$sPattern.'/', '100000000'));
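
Since a generated pattern is hard to inspect by eye, one way to gain confidence in it is an exhaustive check over a small range. The Java sketch below compares a pattern against the plain comparison A <= X && X <= B. The class and method names are hypothetical, and the pattern is assumed to be an already anchored string produced elsewhere, using only syntax that Java and PCRE share (^, $, |, character classes).

```java
import java.util.regex.Pattern;

class RangeRegexVerifier {
    // Returns the first integer in [0, limit] on which the pattern and the
    // comparison a <= x <= b disagree, or -1 if they agree everywhere.
    static long firstMismatch(String regex, long a, long b, long limit) {
        Pattern p = Pattern.compile(regex);
        for (long x = 0; x <= limit; x++) {
            // find() is enough because the pattern carries its own ^ and $ anchors
            boolean byRegex = p.matcher(Long.toString(x)).find();
            boolean byCompare = a <= x && x <= b;
            if (byRegex != byCompare) {
                return x;
            }
        }
        return -1;
    }
}
```

For instance, firstMismatch("^2[0-9]$|^3[0-5]$", 20, 35, 1000) returns -1, while the incomplete pattern "^2[0-9]$" with the same bounds returns 30. A check of this kind also exposes gaps such as a whole digit length that a generator fails to cover.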

Anyway, thanks to everyone who answered.

Source: https://habr.com/ru/post/1546816/

