Conditional Regular Expressions

I use Python and I want to use regular expressions to check if something is “part of the inclusion list” but “is not part of the exception list”.

My inclusion list is represented by a regular expression, for example:

And.* 

Everything that starts with And.

The exception list is also represented by a regular expression, for example:

 (?!Andrea) 

Everything except the string Andrea. The exception list is deliberately written as a negation.

Using the two examples above, I want to match everything that starts with And, with the exception of Andrea.

In general, I have an includeRegEx and an excludeRegEx. I want to match everything that matches includeRegEx but is not on the exception list. Note: excludeRegEx is already in negative form (as you can see in the example above), so it is more accurate to say: if something matches includeRegEx, I check whether it also matches excludeRegEx, and if it does, the match is satisfied. Can this be represented in a single regular expression?

I think conditional regular expressions may be the solution, but I'm not so sure about that.

I would like to see a working example in Python.

Many thanks.

1 answer

Why not put both in one regex?

 And(?!rea$).* 

Since a lookahead only “looks ahead” without consuming any characters, this works just fine (that is, after all, the whole point of a lookaround).

So in Python:

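    import re

    subject = "Andy"  # example input (an assumption, not from the question)

    if re.match(r"And(?!rea$).*", subject):
        # Successful match.
        # Note that re.match always anchors the match
        # to the start of the string.
        print("match")
    else:
        # Match attempt failed.
        print("no match")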
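To see what this pattern accepts and rejects, here is a small check against a few sample strings. The sample strings and the helper name `matches` are made up for illustration. Because of the `$` inside the lookahead, only the exact string Andrea is rejected; a longer string such as Andreas still matches.

```python
import re

# The combined pattern from the answer above.
pattern = re.compile(r"And(?!rea$).*")

def matches(subject):
    """Return True if the pattern matches at the start of subject."""
    return pattern.match(subject) is not None

# Hypothetical sample inputs, chosen only to illustrate the behaviour.
for sample in ["Andrea", "Andreas", "Andy", "And", "Sandy"]:
    print(sample, matches(sample))
```

"Sandy" is rejected for a different reason than "Andrea": `re.match` only tries the pattern at the start of the string, and that string does not begin with And.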

From the wording of your question, I'm not sure whether you have to start with two ready-made lists of match / don't-match patterns. If so, you can combine them automatically by simply concatenating the regular expressions. This works just as well, but it is uglier:

 (?!Andrea$)And.* 

In general:

 (?!excludeRegex$)includeRegex 
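
A minimal sketch of that general form in Python, assuming the exclusion is supplied in positive form (for example `Andrea` rather than `(?!Andrea)`). The function name `combine_patterns` is hypothetical. Both parts are wrapped in non-capturing groups so that a top-level alternation such as `And.*|Bob.*` stays inside its own half; patterns that rely on inline flags or numbered backreferences may still need adjusting by hand.

```python
import re

def combine_patterns(include_regex, exclude_regex):
    """Build (?!exclude$)include as a single compiled pattern.

    exclude_regex is assumed to be in positive form; this function
    adds the negative lookahead and the trailing $ itself.
    """
    # (?:...) keeps a top-level "|" in either pattern from leaking out.
    combined = rf"(?!(?:{exclude_regex})$)(?:{include_regex})"
    return re.compile(combined)

# The example from the question: starts with And, but is not Andrea.
names = combine_patterns(r"And.*", r"Andrea")
print(bool(names.match("Andy")))
print(bool(names.match("Andrea")))

# A made-up example with alternation on both sides.
multi = combine_patterns(r"And.*|Bob.*", r"Andrea|Bobby")
print(bool(multi.match("Bob")))
print(bool(multi.match("Bobby")))
```

If the exclusion already arrives as a lookahead such as `(?!Andrea$)`, plain string concatenation (`exclude_regex + include_regex`) gives the same result without the extra wrapping.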

Source: https://habr.com/ru/post/1308370/

