Why doesn't this regex match the last line?

I have a list of animation data in the most typical formats:

 * » iddle 1-210
 * » run01 215-252
 * » stand up 876-987
 0 - = bindpose
 1 - 48 = idle
 118 - 150 = attack_idle
 151 - 192 = attack 1
 791 - 815 = strafe right
 000 - 009 T-pose
 010 - 040 walk
 045 - 075 walk-back
 080 - 110 walk-right-45
 490 - 590 idle-1
 1060 - 1120 spell-cast_01
 1515 - 1590 sack_pick_up

I am trying to match the animation names ...

I wrote this pattern:

  ([a-zA-Z][\w- _]+) 

It returns:

 1: iddle 1-210
 1: run01 215-252
 1: stand up 876-987
 1: bindpose
 1: idle
 1: attack_idle
 1: attack 1
 1: strafe right
 1: T-pose
 1: walk
 1: walk-back
 1: walk-right-45
 1: idle-1
 1: spell-cast_01
 1: sack_pick_up

To keep the first three matches from including the numbers, I tried this:

  ([a-zA-Z][\w- _]+)(?:\s\d+\s*[-]*\s*\d\s*) 

but it does not match the last line:

 1: iddle
 1: run01
 1: stand up
 1: bindpose
 1: idle
 1: attack_idle
 1: attack 1
 1: strafe right
 1: T-pose
 1: walk
 1: walk-back
 1: walk-right-45
 1: idle-1
 1: spell-cast_01

Why?

I think this is due to the (?:\s part, but I have not found how to fix it ...

EDIT: Fixed a mistake, a stray '|' inside the brackets.

3 answers

Use this regex to capture only the names in group 1:

 ^.*?([a-zA-Z][\w -]+?)(?:(?:\s*\d+-\d+)?)$ 

Use multiline mode, so that ^ and $ match at the start and end of every line.
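As a quick check, here is a minimal Python sketch of the pattern above run over a few lines from the question. Python's re module and the names DATA and NAME_RE are my own choices for illustration, not part of the original answer; re.MULTILINE stands in for multiline mode.

```python
import re

# A few sample lines taken from the question's data.
DATA = """\
* » iddle 1-210
* » stand up 876-987
0 - = bindpose
151 - 192 = attack 1
080 - 110 walk-right-45
1515 - 1590 sack_pick_up"""

# The answer's pattern; re.MULTILINE makes ^ and $ match at every line.
NAME_RE = re.compile(r"^.*?([a-zA-Z][\w -]+?)(?:(?:\s*\d+-\d+)?)$", re.MULTILINE)

# Group 1 holds the animation name without the trailing range.
names = [m.group(1) for m in NAME_RE.finditer(DATA)]
print(names)
# ['iddle', 'stand up', 'bindpose', 'attack 1', 'walk-right-45', 'sack_pick_up']
```

The lazy +? is what lets the optional range group take " 1-210" instead of leaving it inside the name.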


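Errors in your regex

The _ in [\w- _] is not required, since \w already includes _.

The \w- in [\w- _] is incorrect, because the hyphen sits between \w and a space, where it reads as a range operator. \w is a whole class and cannot be the endpoint of a range, so how this is handled depends on the regex engine.

It should be [\w -], because a - at the start or end of a character class is a literal hyphen.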
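To illustrate the character class points, here is a small Python sketch. Python's re is an assumption on my side, since the question does not say which engine is used; Python refuses the [\w- _] form outright, while other engines may accept it and read the hyphen literally.

```python
import re

# Hyphen placed last: it is a literal, and \w already covers the underscore.
NAME_CHARS = re.compile(r"[\w -]+")
ok = NAME_CHARS.fullmatch("walk-right_45 b") is not None

# Hyphen between \w and a space: Python's re rejects it as a bad range.
try:
    re.compile(r"[\w- _]+")
    rejected = False
except re.error:
    rejected = True

print(ok, rejected)  # True True
```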

0
source

Use this regex pattern:

 [a-zA-Z][\w-]*(\s+(?:[a-zA-Z]|\d(?!\d*-))[\w-]*)* 
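A hedged sketch of how this pattern behaves, using Python's re (my choice, not stated in the answer). I am assuming it is applied to one line at a time: the \s+ inside the repeated group also matches line breaks, so on the whole text a name can run into the first number of the next line.

```python
import re

PATTERN = re.compile(r"[a-zA-Z][\w-]*(\s+(?:[a-zA-Z]|\d(?!\d*-))[\w-]*)*")

# Sample lines from the question, searched one by one.
lines = [
    "* » run01 215-252",
    "* » stand up 876-987",
    "151 - 192 = attack 1",
    "000 - 009 T-pose",
    "1060 - 1120 spell-cast_01",
]
names = [PATTERN.search(line).group(0) for line in lines]
print(names)
# ['run01', 'stand up', 'attack 1', 'T-pose', 'spell-cast_01']

# On multi-line input the match crosses the line break.
crossing = PATTERN.search("0 - = bindpose\n1 - 48 = idle").group(0)
print(repr(crossing))  # 'bindpose\n1'
```

The lookahead (?!\d*-) is what stops a digit from joining the name when it starts a range such as 215-252.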
+1
source

I think that on all of your other lines the \s matches the line break and \d+ matches the number at the start of the next line, which is not possible on the last line. Here is another option:
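The effect described above can be reproduced with a short Python sketch. Python's re and the names TAIL and ASKER_RE are assumptions for illustration; the class is written as [\w -] here because Python does not accept the [\w- _] spelling from the question.

```python
import re

# Last three entries from the question's data.
TAIL = "490 - 590 idle-1\n1060 - 1120 spell-cast_01\n1515 - 1590 sack_pick_up"

# The question's second pattern, with the character class written as [\w -].
ASKER_RE = re.compile(r"([a-zA-Z][\w -]+)(?:\s\d+\s*[-]*\s*\d\s*)")

matches = list(ASKER_RE.finditer(TAIL))
names = [m.group(1) for m in matches]
consumed = matches[0].group(0)

print(names)           # ['idle-1', 'spell-cast_01']
print(repr(consumed))  # 'idle-1\n1060 - 1'
```

The full match of the first entry shows the non-capturing group swallowing the line break and the start of the next line's range. Nothing follows sack_pick_up, so the group cannot match there and the name is dropped.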

If you use multiline mode (in C# you can pass RegexOptions.Multiline to the matching function), $ will also match at the end of each line. Then you can do something like this:

 ([a-zA-Z][\w -]+)(?:\s\d+\s*-*\s*\d+)?$ 

This makes the number range at the end optional, but asserts that the line must end right after it.

Note that I removed _ from the character class, because it is already part of \w. I also changed [-] to -, because they are equivalent.
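For comparison, a minimal Python sketch of this pattern, with re.MULTILINE standing in for RegexOptions.Multiline. The names DATA, GREEDY_RE and LAZY_RE are hypothetical, and the lazy variant is my own assumption, borrowed from the +? idea in the first answer rather than taken from this one.

```python
import re

DATA = "* » iddle 1-210\n1060 - 1120 spell-cast_01\n1515 - 1590 sack_pick_up"

# The pattern from this answer: optional number range, then end of line.
GREEDY_RE = re.compile(r"([a-zA-Z][\w -]+)(?:\s\d+\s*-*\s*\d+)?$", re.MULTILINE)
greedy = [m.group(1) for m in GREEDY_RE.finditer(DATA)]
print(greedy)
# ['iddle 1-210', 'spell-cast_01', 'sack_pick_up']

# Assumed variant: a lazy +? leaves a trailing range to the optional group.
LAZY_RE = re.compile(r"([a-zA-Z][\w -]+?)(?:\s\d+\s*-*\s*\d+)?$", re.MULTILINE)
lazy = [m.group(1) for m in LAZY_RE.finditer(DATA)]
print(lazy)
# ['iddle', 'spell-cast_01', 'sack_pick_up']
```

The last line is now matched. With the greedy +, a trailing range such as 1-210 stays inside group 1, because digits, spaces and hyphens all belong to the name's character class; the lazy quantifier is one possible way around that.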

+1
source

Source: https://habr.com/ru/post/1442497/

