I like your weird approach (for your specific needs, that is), but I would generate the pattern string by multiplication. My example expects either groups of 3 or groups of 5 (just for testing):
import re

pattern = re.compile(r'(?:' + r'\s+'.join([ r'([a-f0-9]{2})' ] * 5) +
                     r')|(?:' + r'\s+'.join([ r'([a-f0-9]{2})' ] * 3) + r')')
m1 = pattern.match('ab cd ef')
m2 = pattern.match('ab cd ef 34 56')
The result of m.groups() will look like (None, None, None, None, None, 'ab', 'cd', 'ef') for groups of 3 (m1) and like ('ab', 'cd', 'ef', '34', '56', None, None, None) for groups of 5 (m2). So you can check whether m.groups()[0] is None to find out which version matched (45 or 48 in your case), and then use either groups()[:48] or groups()[48:].
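The check described above could be wrapped up roughly like this. It is only a sketch: the names build_pattern and matched_bytes are made up for illustration, and it uses the 3/5 test sizes rather than your real 45/48.

```python
import re

BYTE = r'([a-f0-9]{2})'  # one two-digit lowercase hex byte, captured

def build_pattern(long_count, short_count, sep=r'\s+'):
    """Alternation of two fixed-length byte sequences, longer one first."""
    long_alt = sep.join([BYTE] * long_count)
    short_alt = sep.join([BYTE] * short_count)
    return re.compile(r'(?:' + long_alt + r')|(?:' + short_alt + r')')

def matched_bytes(match, long_count):
    """Return only the groups of the alternative that actually matched."""
    groups = match.groups()
    if groups[0] is not None:       # first (long) alternative matched
        return groups[:long_count]
    return groups[long_count:]      # second (short) alternative matched

pattern = build_pattern(5, 3)
print(matched_bytes(pattern.match('ab cd ef'), 5))        # ('ab', 'cd', 'ef')
print(matched_bytes(pattern.match('ab cd ef 34 56'), 5))  # ('ab', 'cd', 'ef', '34', '56')
```

For your case you would presumably call build_pattern(48, 45) and pass 48 as long_count.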
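Make sure the alternative with the larger count (48) comes before the one with the smaller count (45). The regex engine tries alternatives from left to right, so if the shorter one came first it would match the first 45 bytes of a 48-byte group and the longer alternative would never be tried.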
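To see why the order matters, here is a small comparison using the 3/5 test sizes. The variable names longer_first and shorter_first are mine, just for the demonstration.

```python
import re

BYTE = r'([a-f0-9]{2})'
five = r'\s+'.join([BYTE] * 5)
three = r'\s+'.join([BYTE] * 3)

# Same two alternatives, only their order differs.
longer_first = re.compile(r'(?:' + five + r')|(?:' + three + r')')
shorter_first = re.compile(r'(?:' + three + r')|(?:' + five + r')')

text = 'ab cd ef 34 56'
print(longer_first.match(text).group(0))   # ab cd ef 34 56
print(shorter_first.match(text).group(0))  # ab cd ef
```

With the shorter alternative first, the match stops after three bytes even though five are available.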
This pattern can, of course, also be used with findall, search, finditer, or similar, provided you have a way to tell where one group of hex bytes ends and the next begins. In this example, the separator between hex bytes inside a group must be a space or a tab; anything else (for example, newlines) separates the groups from each other:
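pattern = re.compile(r'(?:' + r'[ \t]+'.join([ r'([a-f0-9]{2})' ] * 5) +
                     r')|(?:' + r'[ \t]+'.join([ r'([a-f0-9]{2})' ] * 3) + r')')

With findall on a text holding two groups of 5 followed by two groups of 3, the result looks like this: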
[ ('ab', 'cd', 'ef', '34', '56', None, None, None),
  ('ab', 'cd', 'ef', '34', '56', None, None, None),
  (None, None, None, None, None, 'ab', 'cd', 'ef'),
  (None, None, None, None, None, 'ab', 'cd', 'ef') ]