How to match part of a string only if it is not preceded by certain characters?

I created the following regex pattern in an attempt to match a string 6 characters in length ending in either "PRI" or "SEC", unless the string is "SIGSEC". For example, I want to match ABCPRI, XYZPRI, ABCSEC and XYZSEC, but not SIGSEC.

(\w{3}PRI$|[^SIG].*SEC$) 

This is very close and sort of works (if I pass in "SINSEC", it returns a partial match on "NSEC"), but I don't have a good feeling about it in its current form. I may also need to add more exclusions besides "SIG" later, and I realize this probably won't scale well. Any ideas?

By the way, I am using System.Text.RegularExpressions.Regex.Match() in C#.

Thanks, Rich

+4
8 answers

Assuming your regex engine supports negative lookahead, try the following:

 ((?!SIGSEC)\w{3}(?:SEC|PRI)) 

Edit: A commenter pointed out that .NET does support negative lookahead, so this should work fine (thanks, Charlie).
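
For anyone who wants to try this from C#, here is a minimal sketch. The `^` and `$` anchors are my own addition (an assumption, not part of the pattern above) so that only whole six-character strings are accepted, and the test strings are taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Negative lookahead: reject the string if it starts with SIGSEC.
// The ^ and $ anchors are an assumption added for whole-string matching.
var pattern = new Regex(@"^(?!SIGSEC)\w{3}(?:SEC|PRI)$");

foreach (var s in new[] { "ABCPRI", "XYZPRI", "ABCSEC", "XYZSEC", "SIGSEC", "SINSEC" })
{
    // Prints True for every string except SIGSEC.
    Console.WriteLine($"{s}: {pattern.IsMatch(s)}");
}
```

Note that SINSEC matches in full here, rather than producing the partial "NSEC" match described in the question.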

+6

To help break down Dan's (correct) answer, here's how it works:

 (              // outer capturing group to bind everything
   (?!SIGSEC)   // negative lookahead: the match only succeeds if "SIGSEC" does not appear next
   \w{3}        // exactly three "word" characters
   (?:          // non-capturing group - we don't care which of the following things matched
     SEC|PRI    // either "SEC" or "PRI"
   )
 )

All together: ((?!SIGSEC)\w{3}(?:SEC|PRI))

+2

You can try the following:

 @"\w{3}(?:PRI|(?<!SIG)SEC)" 
  • Matches 3 word characters
  • Matches PRI or SEC (but not SEC after SIG, i.e. SIGSEC is excluded). (?<!x)y is a negative lookbehind: it matches y only if it is not preceded by x.

In addition, I may need to add more exclusions besides "SIG" later, and I realize this probably won't scale too well

With my pattern, you can easily add more exclusions. For example, the following excludes SIGSEC and FOOSEC:

 @"\w{3}(?:PRI|(?<!SIG|FOO)SEC)" 
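
A quick way to check the lookbehind version from C# might look like the sketch below. The `^` and `$` anchors are an assumption I added so that whole strings are tested, and SIGPRI is a made-up input used only to show that the lookbehind guards the SEC branch alone.

```csharp
using System;
using System.Text.RegularExpressions;

// ^ and $ are added here as an assumption, so only whole strings are tested.
var lookbehind = new Regex(@"^\w{3}(?:PRI|(?<!SIG|FOO)SEC)$");

foreach (var s in new[] { "ABCSEC", "SIGSEC", "FOOSEC", "SIGPRI" })
{
    // SIGSEC and FOOSEC are rejected; SIGPRI still matches because the
    // lookbehind only applies to the SEC alternative.
    Console.WriteLine($"{s}: {lookbehind.IsMatch(s)}");
}
```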
+1

Why not use more readable code? In my opinion, it is much more convenient to maintain.

 private Boolean HasValidEnding(String input)
 {
     if (input.EndsWith("SEC", StringComparison.Ordinal) || input.EndsWith("PRI", StringComparison.Ordinal))
     {
         if (!input.Equals("SIGSEC", StringComparison.Ordinal))
         {
             return true;
         }
     }
     return false;
 }

or in one line

 private Boolean HasValidEnding(String input)
 {
     return (input.EndsWith("SEC", StringComparison.Ordinal) || input.EndsWith("PRI", StringComparison.Ordinal)) && !input.Equals("SIGSEC", StringComparison.Ordinal);
 }

It's not that I don't use regular expressions, but in this case I wouldn't use them.

+1

Personally, I'd be inclined to build up the exclusion list in a separate variable and then include it in the full expression. That's the approach I've used in the past when I needed to build any complex expression.

Something like:

 exclude = 'someexpression';
 prefix = 'list of prefixes';
 suffix = 'list of suffixes';
 expression = '{prefix}{exclude}{suffix}';
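
In C#, that idea could be sketched roughly as follows. The variable names and list contents are hypothetical (SIGSEC comes from the question, FOOSEC from another answer), and placing the exclusions in an anchored negative lookahead is one possible choice, not the only one.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical lists: extend these as new exclusions or suffixes appear.
string[] excluded = { "SIGSEC", "FOOSEC" };
string[] suffixes = { "PRI", "SEC" };

// Regex.Escape guards against metacharacters in the list entries.
string exclude = string.Join("|", excluded.Select(e => Regex.Escape(e)));
string suffix = string.Join("|", suffixes.Select(s => Regex.Escape(s)));

// Builds: ^(?!(?:SIGSEC|FOOSEC)$)\w{3}(?:PRI|SEC)$
string expression = $@"^(?!(?:{exclude})$)\w{{3}}(?:{suffix})$";
var regex = new Regex(expression);

Console.WriteLine(expression);
Console.WriteLine(regex.IsMatch("ABCSEC")); // True
Console.WriteLine(regex.IsMatch("FOOSEC")); // False
```

Adding another exclusion then only means adding an entry to `excluded`; the pattern itself does not need to be edited by hand.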

0

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." (Jamie Zawinski)

0

You might not even want to do the exclusions in the regex. For example, if this were Perl (I don't know C#, but you can probably follow along), I would do it like this:

 if ( ( $str =~ /^\w{3}(?:PRI|SEC)$/ ) && ( $str ne 'SIGSEC' ) ) 

This keeps it clear. It says exactly what you wanted:

  • Three word characters followed by PRI or SEC, and
  • It is not SIGSEC

No one says you have to force everything into one regex.
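
A rough C# equivalent of the same two-step check might look like this. The names `shape`, `excluded` and `IsValid` are hypothetical, and keeping the exclusions in a `HashSet` is just one way to let the list grow without touching the regex.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Step 1: the general shape, three word characters followed by PRI or SEC.
var shape = new Regex(@"^\w{3}(?:PRI|SEC)$");

// Step 2: the explicit exclusions, kept outside the regex (hypothetical list).
var excluded = new HashSet<string>(StringComparer.Ordinal) { "SIGSEC" };

bool IsValid(string s) => shape.IsMatch(s) && !excluded.Contains(s);

Console.WriteLine(IsValid("ABCPRI")); // True
Console.WriteLine(IsValid("SIGSEC")); // False
```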

0

Go get RegexBuddy from RegExBuddy.com. It's an amazingly simple tool that will help you easily work out even the most complex regex.

-1

Source: https://habr.com/ru/post/1277779/

