Using Python's .match() regex method to get the strings before and after the underscore

Question

I have the following code ...

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]

for table in tablesInDataset:
    tableregex = re.compile("\d{8}")
    tablespec = re.match(tableregex, table)

    everythingbeforedigits = tablespec.group(0)
    digits = tablespec.group(1)

My regex should only return a match if the string contains 8 digits after the underscore. When it matches, I want to use .match() to get two groups via .group(). The first group must contain all the characters before the digits, and the second must contain the 8 digits.

Can someone please help me figure out the right regex to get the results I'm looking for using .match() and .group()?

+4

python iterator string regex

Erik Åsland Aug 25 '16 at 19:30

4 Answers

Use named capture groups:

>>> import re
>>> s = 'henry_jones_12345678'
>>> pat = re.compile(r'(?P<name>.*)_(?P<number>\d{8})')
>>> pat.findall(s)
[('henry_jones', '12345678')]

Or, using match:

>>> match = pat.match(s)
>>> match.groupdict()
{'name': 'henry_jones', 'number': '12345678'}

+5

wim Aug 25 '16 at 19:33

I think this pattern should match what you need: (.*?_)(\d{8})

The first group captures everything up to the 8 digits, including the underscore. The second group is the 8 digits.

If you do not want the underscore included in the first group, use this instead: (.*?)_(\d{8})
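
To make the difference between the two patterns concrete, here is a small sketch that runs both against the first table name from the question. The variable names are only illustrative.

```python
import re

# Sample input taken from the question's list of table names.
sample = "henry_jones_12345678"

# Underscore is captured as part of the first group.
with_underscore = re.match(r"(.*?_)(\d{8})", sample)

# Underscore sits between the groups and is not captured.
without_underscore = re.match(r"(.*?)_(\d{8})", sample)

print(with_underscore.groups())     # ('henry_jones_', '12345678')
print(without_underscore.groups())  # ('henry_jones', '12345678')
```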

+2

John Aug 25 '16 at 19:33

Here you go:

import re

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]
rx = re.compile(r'^(\D+)_(\d{8})$')

matches = [match.groups()
           for item in tablesInDataset
           for match in [rx.search(item)]
           if match]
print(matches)

Better than any dot-star-soup :)
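
One way to see what the anchors buy you is to try an input with more than 8 trailing digits. The sketch below is only an illustration. The 9-digit table name is a made-up example, not one from the question.

```python
import re

anchored = re.compile(r'^(\D+)_(\d{8})$')
unanchored = re.compile(r'(.*)_(\d{8})')

# Hypothetical input with 9 trailing digits (not from the question).
too_long = "henry_jones_123456789"

# Without anchors, the first 8 digits match and the 9th is silently ignored.
loose = unanchored.match(too_long)
print(loose.groups())  # ('henry_jones', '12345678')

# With ^ and $, the whole string must fit the pattern, so this is rejected.
strict = anchored.search(too_long)
print(strict)  # None
```

Note that \D+ also rejects names that contain digits before the final underscore. Whether that is desirable depends on your table names.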

+1

Jan Aug 25 '16 at 19:50

Danielle M. · Accepted Answer · 2016-08-25T19:33:40+0000

tableregex = re.compile(r"(.*)_(\d{8})")
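
As a rough sketch of how this pattern could slot into the loop from the question: .match() returns None when there is no match, so that case has to be checked first. Also, group(0) is the entire match, so the two captured parts are group(1) and group(2). The helper name split_table_name is hypothetical and used only for illustration.

```python
import re

# Pattern from the answer, written as a raw string.
tableregex = re.compile(r"(.*)_(\d{8})")

def split_table_name(table):
    """Hypothetical helper: return (prefix, digits), or None if no match."""
    tablespec = tableregex.match(table)
    if tablespec is None:
        return None
    # group(0) would be the whole match; the captures start at group(1).
    everythingbeforedigits = tablespec.group(1)
    digits = tablespec.group(2)
    return everythingbeforedigits, digits

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]
results = {table: split_table_name(table) for table in tablesInDataset}

for table, parts in results.items():
    print(table, "->", parts)
```

Only the first table name produces a pair. The other two yield None because they lack 8 digits after an underscore.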
