Perl: Why doesn't my regex match?

I am trying to extract part of a string using regex. I have the following cases for a string:

case1: Warehouse.13.season01episode01.hdtv.xor.avi
case2: Warehouse.13.s01e01.hdtv.xor.avi
case3: Warehouse.13.01x01.hdtv.xor.avi

The delimiter (.) in the strings above can also be whitespace (\s), - or _.

The logic is to check whether s or season is preceded by a digit (a lookbehind) and to extract everything before it. Since a lookbehind needs a fixed length, I reversed the string and used a lookahead instead.

Now for case1, I created the following regex, which works fine and produces Warehouse.13:

 .*?\d{1,2}e\d{1,2}s\.(?=\d+)(.*) 
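To make the reversal step concrete, here is a minimal sketch of how the pattern above might be applied. It assumes the file name is reversed before matching and the capture is reversed back afterwards; the variable names ($name, $rev, $title) are only illustrative. Note that this particular pattern fits the reversed s01e01 form of the name.

```perl
use strict;
use warnings;

# Reverse the file name so that "preceded by a digit" becomes a lookahead.
my $name = 'Warehouse.13.s01e01.hdtv.xor.avi';
my $rev  = reverse $name;    # iva.rox.vtdh.10e10s.31.esuoheraW

my $title;
if ($rev =~ /.*?\d{1,2}e\d{1,2}s\.(?=\d+)(.*)/) {
    # $1 holds the reversed title; undo the reversal.
    $title = reverse $1;
}

print defined $title ? "$title\n" : "(no match)\n";
```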

Now for case2 I used:

 .*?\d{1,2}edosipe\d{1,2}nosaes\.(?=\d+)(.*) # works fine. 

Now when I try to combine the two cases above, plus an extra separator, like this:

 .*?\d{1,2}[e|edosipe]?[._ x\-]?\d{1,2}[s|nosaes]?[._\- ]?(?=\d+)(.*) 

In the above regex, you may notice that most things are optional (?). This is for case3.

The above expression does not match case2, but works fine for case1 and case3.

Any idea what is wrong here?

PS: I know there may be other possible strings that will defeat the above regex, but I am not interested in them currently.

1 answer

[e|edosipe] and [s|nosaes] must be (e|edosipe) and (s|nosaes), or (?:e|edosipe) and (?:s|nosaes) if you do not want the regex engine to capture them and throw off your numbering of $1, $2, etc.
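As a rough check of this fix, the sketch below applies the combined pattern with non-capturing groups to the three file names from the question. It assumes each name is reversed before matching, as the question describes; the names $re, @names and @titles are illustrative only.

```perl
use strict;
use warnings;

# Combined pattern with (?:...) groups in place of the character classes.
my $re = qr/.*?\d{1,2}(?:e|edosipe)?[._ x\-]?\d{1,2}(?:s|nosaes)?[._\- ]?(?=\d+)(.*)/;

my @names = (
    'Warehouse.13.season01episode01.hdtv.xor.avi',
    'Warehouse.13.s01e01.hdtv.xor.avi',
    'Warehouse.13.01x01.hdtv.xor.avi',
);

my @titles;
for my $name (@names) {
    my $rev = reverse $name;                      # match against the reversed name
    if ($rev =~ $re) {
        push @titles, scalar(reverse $1);         # reverse the capture back
    }
    else {
        push @titles, undef;
    }
}

for my $t (@titles) {
    my $shown = defined $t ? $t : '(no match)';
    print "$shown\n";
}
```

Because the groups are non-capturing, the title is still in $1.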

Here the (...) parentheses group just as they do in any other Perl expression, whereas [...] defines a character class. In particular, [s|nosaes] matches a single character, which is one of a, e, n, o, s, or (perhaps surprisingly, since metacharacters are usually ignored inside [...]) |.
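The difference can be seen in a small sketch that tests a few strings against both forms. The anchors ^ and $ are added here only to make the comparison strict, and the hash names are illustrative.

```perl
use strict;
use warnings;

my $class = qr/^[s|nosaes]$/;      # character class: exactly one character
my $group = qr/^(?:s|nosaes)$/;    # alternation: "s" or "nosaes"

my (%class_hit, %group_hit);
for my $s ('s', '|', 'nosaes') {
    $class_hit{$s} = ($s =~ $class) ? 1 : 0;
    $group_hit{$s} = ($s =~ $group) ? 1 : 0;
    printf "%-8s class=%d group=%d\n", $s, $class_hit{$s}, $group_hit{$s};
}
```

The class accepts a lone | but rejects the whole word nosaes, while the group does the opposite.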


Source: https://habr.com/ru/post/1434436/

