One alternative is to use pyparsing together with its regex inverter example, which expands a regular expression into the strings it can match. For your sample string x y{1,2} z it generates the two strings obtained by expanding the quantifier:
    $ python -i regex_invert.py
    >>> s = "x y{1,2} z"
    >>> for item in invert(s):
    ...     print(item)
    ...
    x y z
    x yy z
The repetition element supports a fixed count ({n}), a closed range ({m,n}), and the operators *, + and ?. It is defined as:
    repetition = (
        (lbrace + Word(nums).setResultsName("count") + rbrace)
        | (lbrace + Word(nums).setResultsName("minCount") + "," + Word(nums).setResultsName("maxCount") + rbrace)
        | oneOf(list("*+?"))
    )
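To make the three alternatives concrete without pyparsing, here is a minimal standard-library sketch that recognizes the same quantifier forms and maps them to (min, max) bounds. The names QUANTIFIER and repetition_bounds are hypothetical and are not part of regex_invert.py. Returning None for an unbounded maximum is my own convention here, not something the example script does.

```python
import re

# Same three alternatives as the grammar above: {n}, {m,n}, or one of * + ?
# (hypothetical stand-in for the pyparsing expression, not the script's code)
QUANTIFIER = re.compile(
    r"\{(?P<count>\d+)\}"
    r"|\{(?P<minCount>\d+),(?P<maxCount>\d+)\}"
    r"|(?P<op>[*+?])"
)


def repetition_bounds(text):
    """Return (min, max) for a quantifier; max is None when unbounded."""
    m = QUANTIFIER.fullmatch(text)
    if m is None:
        raise ValueError(f"not a supported quantifier: {text!r}")
    if m.group("count") is not None:
        n = int(m.group("count"))
        return n, n
    if m.group("minCount") is not None:
        return int(m.group("minCount")), int(m.group("maxCount"))
    # Standard regex semantics for the single-character operators.
    return {"*": (0, None), "+": (1, None), "?": (0, 1)}[m.group("op")]


print(repetition_bounds("{1,2}"))  # (1, 2)
print(repetition_bounds("{3}"))    # (3, 3)
print(repetition_bounds("?"))      # (0, 1)
```

Note that a form such as {1,} is rejected by this sketch, because the second alternative requires both bounds.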
To get the desired result, we need to change the way we get results from the recurseList generator and return lists instead of strings:
    for s in elist[0].makeGenerator()():
        for s2 in recurseList(elist[1:]):
            yield [s] + [s2]
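To see what shape this produces, here is a small self-contained sketch of the same recursion. It is a stand-in, not the script's code: each element is assumed to be a plain list of alternative strings rather than an emitter object with makeGenerator(), and recurse_list is a hypothetical name.

```python
def recurse_list(choices):
    """Yield every combination of the alternatives as a nested list.

    `choices` is a list of lists of strings; each inner list plays the
    role of one emitter's generator (an assumption for illustration).
    """
    head, rest = choices[0], choices[1:]
    if not rest:
        # Last element: yield the bare alternative.
        for s in head:
            yield s
    else:
        for s in head:
            for s2 in recurse_list(rest):
                # A two-item list instead of a concatenated string.
                yield [s] + [s2]


# "x y{1,2} z" with the quantifier already expanded to its alternatives
choices = [["x"], [" "], ["y", "yy"], [" "], ["z"]]
for option in recurse_list(choices):
    print(option)
# ['x', [' ', ['y', [' ', 'z']]]]
# ['x', [' ', ['yy', [' ', 'z']]]]
```

Each result nests one level deeper per element, which is why a flattening step is needed afterwards.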
Then we only need to flatten the result:
    $ ipython3 -i regex_invert.py

    In [1]: import collections.abc

    In [2]: def flatten(l):
       ...:     for el in l:
       ...:         if isinstance(el, collections.abc.Iterable) and not isinstance(el, (str, bytes)):
       ...:             yield from flatten(el)
       ...:         else:
       ...:             yield el
       ...:

    In [3]: s = "x y{1,2} z"

    In [4]: for option in invert(s):
       ...:     print(list(flatten(option)))
       ...:
    ['x', ' ', 'y', None, ' ', 'z']
    ['x', ' ', 'y', 'y', ' ', 'z']
Then, if necessary, you can filter out whitespace characters:
    In [5]: for option in invert(s):
       ...:     print([item for item in flatten(option) if item != ' '])
       ...:
    ['x', 'y', None, 'z']
    ['x', 'y', 'y', 'z']
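The filtered output above still contains None for the repetition that was not used. If that is unwanted, the filter can drop it as well. Here is a standalone sketch, assuming a hand-written nested list that only mimics what invert might yield (the exact nesting in the real script may differ):

```python
from collections.abc import Iterable


def flatten(items):
    """Recursively flatten nested iterables, keeping strings whole."""
    for el in items:
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el


# Hypothetical nested result for "x y{1,2} z" with a single "y"
nested = ["x", [" ", [["y", None], [" ", "z"]]]]

# Drop None placeholders and any whitespace-only items
tokens = [t for t in flatten(nested) if t is not None and not t.isspace()]
print(tokens)  # ['x', 'y', 'z']
```

Checking for None before calling isspace() matters, since None has no string methods.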