Regular expression matching everything except given strings

I need to find all strings matching a pattern, except for two given strings.

For example, find all groups of letters except aa and bb. Starting from this string:

-a-bc-aa-def-bb-ghij-

It should return:

('a', 'bc', 'def', 'ghij')

I tried with this regular expression, which captures the 4 strings. I thought I was getting close, but (1) it does not work in Python and (2) I cannot figure out how to exclude several strings from the search. (Yes, I could remove them afterwards, but my real regex does everything in one shot, and I would like to include this last step in it.)

I say that it does not work in Python because I tried this, expecting exactly the same result, but I get only the first group:

>>> import re
>>> re.search('-(\w.*?)(?=-)', '-a-bc-def-ghij-').groups()
('a',)


You can use a negative lookahead. For example:

>>> re.findall(r'-(?!aa|bb)([^-]+)', string)
['a', 'bc', 'def', 'ghij']

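  • - matches a literal hyphen.

  • (?!aa|bb) is a negative lookahead: it asserts that the hyphen is not immediately followed by aa or bb.

  • ([^-]+) captures one or more characters other than -.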
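To see why the attempt in the question returned a single group, it may help to compare the two calls side by side. This is only a sketch: the name string and its value are taken from the example in the question, and the pattern is the one shown above. re.search stops at the first match, while re.findall collects every non-overlapping match.

```python
import re

string = "-a-bc-aa-def-bb-ghij-"
pattern = r'-(?!aa|bb)([^-]+)'

# re.search stops at the first match, so only one group comes back.
first_only = re.search(pattern, string).groups()

# re.findall scans the whole string and returns every captured group.
all_groups = re.findall(pattern, string)

print(first_only)   # ('a',)
print(all_groups)   # ['a', 'bc', 'def', 'ghij']
```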


Edit

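The pattern above also rejects any group that merely starts with aa or bb, for example -aabc-. To exclude only the exact groups, add the trailing - to the lookahead: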

>>> re.findall(r'-(?!aa-|bb-)([^-]+)', string)
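The difference between the two lookaheads only shows up when a group starts with one of the excluded strings. The sample string below is made up for illustration and is not from the question:

```python
import re

# Hypothetical input: 'aabc' starts with 'aa' but is not equal to it.
sample = "-a-aabc-aa-bb-"

# Prefix check: rejects every group that begins with aa or bb.
prefix_excluded = re.findall(r'-(?!aa|bb)([^-]+)', sample)

# Exact check: rejects only the groups that are exactly aa or bb.
exact_excluded = re.findall(r'-(?!aa-|bb-)([^-]+)', sample)

print(prefix_excluded)  # ['a']
print(exact_excluded)   # ['a', 'aabc']
```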

Use a negative lookahead to restrict a more generic pattern, and re.findall to collect all matches.

res = re.findall(r'-(?!(?:aa|bb)-)(\w+)(?=-)', s)

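If the text between hyphens may contain characters other than word characters, use the negated character class [^-] instead: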

res = re.findall(r'-(?!(?:aa|bb)-)([^-]+)(?=-)', s)


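  • - - a literal hyphen
  • (?!(?:aa|bb)-) - a negative lookahead that fails the match if aa- or bb- follows immediately
  • (\w+) - Group 1 (the value returned by re.findall), 1 or more word characters; alternatively [^-]+ - 1 or more characters other than -
  • (?=-) - a positive lookahead that requires a - right after the group. The hyphen is checked but not consumed, so it can also start the next match (the delimiters overlap).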
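The role of the trailing (?=-) can be checked with a small experiment. The sketch below drops the exclusion part to isolate the effect, so both patterns are simplified variants and not the ones from the answer:

```python
import re

s = "-a-bc-def-ghij-"

# A trailing '-' inside the match is consumed, so the next group
# has no leading hyphen left and gets skipped.
consuming = re.findall(r'-(\w+)-', s)

# A lookahead only checks for the '-', leaving it for the next match.
lookahead = re.findall(r'-(\w+)(?=-)', s)

print(consuming)  # ['a', 'def']
print(lookahead)  # ['a', 'bc', 'def', 'ghij']
```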

Python demo:

import re
p = re.compile(r'-(?!(?:aa|bb)-)([^-]+)(?=-)')
s = "-a-bc-aa-def-bb-ghij-"
print(p.findall(s)) # => ['a', 'bc', 'def', 'ghij']
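If the excluded strings are not known in advance, the same pattern could be assembled at run time. The helper name find_groups_except is hypothetical and only a sketch; it relies on re.escape so that excluded strings containing regex metacharacters are matched literally.

```python
import re

def find_groups_except(text, excluded):
    """Return hyphen-delimited groups of text that are not in excluded."""
    # re.escape keeps characters such as '.' or '+' literal.
    alternatives = '|'.join(re.escape(word) for word in excluded)
    pattern = re.compile(r'-(?!(?:' + alternatives + r')-)([^-]+)(?=-)')
    return pattern.findall(text)

print(find_groups_except("-a-bc-aa-def-bb-ghij-", ["aa", "bb"]))
# ['a', 'bc', 'def', 'ghij']
```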

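Although a regex was asked for, I would argue that this problem is easier to solve with plain Python, namely by splitting the string and filtering: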

input_list = "-a-bc-aa-def-bb-ghij-"
exclude = {"aa", "bb"}
result = [s for s in input_list.split('-')[1:-1] if s not in exclude]

This solution has the additional advantage that result can also be turned into a generator, so the list of results does not need to be built explicitly.
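A generator version might look like the sketch below. Note that split still builds the full list of pieces; only the filtered output is produced lazily.

```python
input_list = "-a-bc-aa-def-bb-ghij-"
exclude = {"aa", "bb"}

# Generator expression: groups are filtered lazily, one at a time.
result = (s for s in input_list.split('-')[1:-1] if s not in exclude)

first = next(result)   # pulls only the first surviving group
rest = list(result)    # consumes whatever is left

print(first)  # a
print(rest)   # ['bc', 'def', 'ghij']
```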


Source: https://habr.com/ru/post/1655227/

