Extra regular expression match

I am trying to match these input strings into three capture groups (Regex101 link):

    | input string  | x  | y   | z  |
------------------------------------
  I | a             | a  |     |    |
 II | a - b         | a  | b   |    |
III | a - b-c       | a  | b-c |    |
 IV | a - b, 12     | a  | b   | 12 |
  V | a - 12        | a  |     | 12 |
 VI | 12            |    |     | 12 |

So, the anatomy of the input lines is as follows:

  • an optional first part of free text, up to a hyphen with surrounding spaces ( - ) or the end of the input line
  • an optional second part of any characters after the first hyphen with surrounding spaces, up to a comma or the end of the input line
  • optionally, exactly two digits at the end

I tried many different solutions; this is my current attempt:

^(?P<x>.*)(?:-)(?P<y>.*)(?<!\d)(?P<z>\d{0,2})(?!\d)$

It handles cases II, IV and V OK (although it should also trim some whitespace), but:

  • I and VI do not match at all
  • III is split at the last hyphen instead of the first
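
To see how the attempt behaves on each case, a small Python sketch can run it over the six inputs from the table. The names `ATTEMPT`, `CASES` and `run_attempt` are made up for this illustration and are not part of the question.

```python
import re

# The pattern from the question, unchanged.
ATTEMPT = re.compile(r"^(?P<x>.*)(?:-)(?P<y>.*)(?<!\d)(?P<z>\d{0,2})(?!\d)$")

# The six sample inputs from the table, keyed by case number.
CASES = {
    "I": "a",
    "II": "a - b",
    "III": "a - b-c",
    "IV": "a - b, 12",
    "V": "a - 12",
    "VI": "12",
}


def run_attempt(text):
    """Return the (x, y, z) groups, or None when the pattern does not match."""
    m = ATTEMPT.match(text)
    return m.group("x", "y", "z") if m else None


for case, text in CASES.items():
    print(f"{case:>3} | {text!r:<12} | {run_attempt(text)}")
```

Cases I and VI come back as None because the pattern requires a hyphen. Case III yields x = 'a - b' and y = 'c', since the greedy `.*` in x runs up to the last hyphen. In the remaining cases x and y keep their surrounding whitespace.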

One possible pattern:

^(?:(.*?)(?: - |$))?(?:(.*?)(?:, |$))?(\d\d$)?$

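Note that it uses numbered groups 1, 2 and 3 rather than named ones.

There is one problem, though: the two trailing digits end up in the wrong group, namely

  • in group 2 for case V
  • in group 1 for case VI,

instead of group 3.

To fix that, add a negative lookahead, (?!\d\d$), to groups 1 and 2, so that neither of them can start at the two final digits: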

^(?:((?!\d\d$).*?)(?: - |$))?(?:((?!\d\d$).*?)(?:, |$))?(\d\d$)?$
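
A quick way to compare the pattern with and without the lookaheads is to run both over the six inputs. This is only a sketch in Python; `BASIC`, `FIXED` and `groups` are names chosen for the illustration, and unmatched groups are shown as empty strings to make the rows easier to compare.

```python
import re

# Pattern without the lookaheads (numbered groups 1, 2, 3).
BASIC = re.compile(r"^(?:(.*?)(?: - |$))?(?:(.*?)(?:, |$))?(\d\d$)?$")

# Same pattern with (?!\d\d$) guarding groups 1 and 2.
FIXED = re.compile(
    r"^(?:((?!\d\d$).*?)(?: - |$))?(?:((?!\d\d$).*?)(?:, |$))?(\d\d$)?$"
)


def groups(pattern, text):
    """Return groups 1-3 as a tuple, with unmatched groups shown as ''."""
    m = pattern.match(text)
    return tuple(g or "" for g in m.groups()) if m else None


for text in ["a", "a - b", "a - b-c", "a - b, 12", "a - 12", "12"]:
    print(f"{text!r:<12} basic={groups(BASIC, text)} fixed={groups(FIXED, text)}")
```

Only the last two rows differ: without the lookaheads, 'a - 12' puts 12 in group 2 and '12' puts it in group 1, while the guarded pattern moves it to group 3 in both cases.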

The basic pattern (without the lookaheads), piece by piece:

^                    # string starts
(?:(.*?)(?: - |$))?  # any text, reluctantly, and " - " or the string ends
(?:(.*?)(?:, |$))?   # any text, reluctantly, and ", " or the string ends
(\d\d$)?             # two digits and the string ends
$                    # string ends

Another variant with named groups, first on one line and then in verbose form with comments:

^(?P<x>(?!\d\d$)(?:(?! - ).)*)?(?: - (?P<y>(?!\d\d$)[^,\n]*)?(?:, )?)?(?P<z>\d\d)?$

^                   # assert start of string/line
(?P<x>              # capture in group "x"
    (?!\d\d$)       # if the whole string is just two digits, don't capture them in group x
    (?:             # as long as...
        (?!\ -\ )   # ...we don't come across the text " - "...
        .           # ...consume the next character
    )*
)?                  # make group x optional
(?:                 # if possible...
    \ -\            # consume the " - " separator
    (?P<y>          # then capture group "y"
        (?!\d\d$)   # again, only if this isn't two digits which belong in group z
        [^,\n]*     # consume everything up to a comma
    )?              # group y is also optional
    (?:,\ )?        # consume the ", " separator, if present
)?
(?P<z>              # finally, capture in group "z"...
    \d\d            # ...two digits...
)?                  # ...if present
$                   # assert end of string
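
As a rough check, the one-line pattern and a verbose-mode rendering of it can be compiled in Python and run over the six inputs. With `re.VERBOSE`, literal spaces in the pattern are ignored, so the sketch writes them as `\ `. The names `ONE_LINE`, `VERBOSE` and `parse` are illustrative only.

```python
import re

# The one-line pattern, split across string literals for readability.
ONE_LINE = re.compile(
    r"^(?P<x>(?!\d\d$)(?:(?! - ).)*)?"
    r"(?: - (?P<y>(?!\d\d$)[^,\n]*)?(?:, )?)?"
    r"(?P<z>\d\d)?$"
)

# The same pattern in verbose mode; significant spaces are escaped.
VERBOSE = re.compile(
    r"""
    ^
    (?P<x> (?!\d\d$) (?: (?!\ -\ ) . )* )?   # text before the first separator
    (?:
        \ -\                                 # space, hyphen, space
        (?P<y> (?!\d\d$) [^,\n]* )?          # text up to a comma
        (?:,\ )?                             # comma and space, if present
    )?
    (?P<z> \d\d )?                           # two trailing digits
    $
    """,
    re.VERBOSE,
)


def parse(text, pattern=VERBOSE):
    """Return a dict of x, y, z with unmatched groups as '', or None."""
    m = pattern.match(text)
    return m.groupdict(default="") if m else None


for text in ["a", "a - b", "a - b-c", "a - b, 12", "a - 12", "12"]:
    # Both spellings of the pattern should agree on every input.
    assert parse(text, ONE_LINE) == parse(text, VERBOSE)
    print(f"{text!r:<12} -> {parse(text)}")
```

Using `groupdict(default="")` is a convenience of the sketch: groups that did not take part in the match would otherwise come back as None.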

Another option, built on lookarounds:

^
    (?:(?P<x>\D*?)(?=(?:\ -\ |$)))?
    (?:.*?(?<=\ -\ )(?P<y>[^\d,]+)(?=,|$))?
    (?:.*?(?P<z>\d{2}$))?
$

See a demo on regex101.com (with the verbose [aka x] and multiline [aka m] flags).

Explained:
^                       # start of the line
    (?:                 # non capturing parentheses
        (?P<x>\D*?)     # no digits lazily ...
        (?=\ -\ |$)     # up until either " - " or end of string
    )?                  # optional
    (?:
        .*?             # match everything lazily
        (?<=\ -\ )      # pos. lookbehind
        (?P<y>[^\d,]+)  # not a comma or digit
        (?=,|$)         # up until a comma or end of string
    )?
    (?:
        .*?
        (?P<z>\d{2}$)   # two digits at the end
    )?
$
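
The lookaround pattern can be tried in Python in much the same way. In this sketch, `LOOKAROUND` and `parse_line` are hypothetical names, and the input is matched one line at a time instead of using the multiline flag. That is a choice of the sketch: `\D` and `[^\d,]` also match newline characters, so per-line matching keeps each match inside its own line.

```python
import re

# The pattern as given, compiled in verbose mode (spaces are escaped as "\ ").
LOOKAROUND = re.compile(
    r"""
    ^
        (?:(?P<x>\D*?)(?=\ -\ |$))?
        (?:.*?(?<=\ -\ )(?P<y>[^\d,]+)(?=,|$))?
        (?:.*?(?P<z>\d{2}$))?
    $
    """,
    re.VERBOSE,
)


def parse_line(line):
    """Return a dict of x, y, z with unmatched groups as '', or None."""
    m = LOOKAROUND.match(line)
    return m.groupdict(default="") if m else None


text = "a\na - b\na - b-c\na - b, 12\na - 12\n12"
for line in text.splitlines():
    print(f"{line!r:<12} -> {parse_line(line)}")
```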

Source: https://habr.com/ru/post/1674686/

