How to match part of a string only if it is not preceded by certain characters?

Question

I created the following regex pattern in an attempt to match a six-character string ending in either "PRI" or "SEC", unless the string is "SIGSEC". For example, I want to match ABCPRI, XYZPRI, ABCSEC and XYZSEC, but not SIGSEC.

(\w{3}PRI$|[^SIG].*SEC$)

This is very close and sort of works (if I pass in "SINSEC", it returns a partial match on "NSEC"), but I don't have a good feeling about it in its current form. Also, I may need to add more exclusions besides "SIG" later, and I realize that this probably won't scale too well. Any ideas?

By the way, I am using System.Text.RegularExpressions.Regex.Match() in C#.

Thanks, Rich

+4

c# regex .net negative-lookbehind

Rich Oct 16 '08 at 2:46

8 answers

To help break down Dan's answer (which is correct), here's how it works:

 (               // outer capturing group to bind everything
   (?!SIGSEC)    // negative lookahead: a match only works if "SIGSEC" does not appear next
   \w{3}         // exactly three "word" characters
   (?:           // non-capturing group - we don't care which of the following things matched
     SEC|PRI     // either "SEC" or "PRI"
   )
 )

All together: ((?!SIGSEC)\w{3}(?:SEC|PRI))
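As a rough illustration of the breakdown above, here is a minimal C# sketch that runs the lookahead pattern against the examples from the question. The class and method names are hypothetical, and the ^ and $ anchors are an added assumption so that only whole six-character inputs are accepted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Dan's pattern. The ^ and $ anchors are NOT part of the original answer;
    // they are assumed here so that longer strings cannot match partially.
    private static readonly Regex Pattern =
        new Regex(@"^((?!SIGSEC)\w{3}(?:SEC|PRI))$");

    // True when the input has the required shape and is not "SIGSEC".
    public static bool IsAllowed(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        string[] samples = { "ABCPRI", "XYZPRI", "ABCSEC", "XYZSEC", "SIGSEC" };
        foreach (string s in samples)
        {
            Console.WriteLine("{0}: {1}", s, IsAllowed(s));
        }
    }
}
```

With these inputs, the first four should be reported as matches and SIGSEC should not.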

+2

Charlie Oct 16 '08 at 2:59

You can try the following:

 @"\w{3}(?:PRI|(?<!SIG)SEC)"

Matches three word characters
Then matches PRI or SEC (but SEC only when it does not come after SIG, i.e. SIGSEC is excluded)

(?<!x)y is a negative lookbehind (it matches y only if it is not preceded by x)

"Also, I may need to add more exclusions besides "SIG" later, and I realize that this probably won't scale too well"

With my code, you can easily add more exclusions. For example, the following code excludes both SIGSEC and FOOSEC:

 @"\w{3}(?:PRI|(?<!SIG|FOO)SEC)"
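A small C# sketch of the lookbehind variant follows, assuming whole-string matching is wanted (the ^ and $ anchors are an addition, and the class and method names are hypothetical). It relies on .NET accepting alternation inside a lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // The pattern from this answer, with ^ and $ added as an assumption
    // so that only complete six-character inputs are accepted.
    private static readonly Regex Pattern =
        new Regex(@"^\w{3}(?:PRI|(?<!SIG|FOO)SEC)$");

    // True for xxxPRI, and for xxxSEC unless xxx is SIG or FOO.
    public static bool IsAllowed(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        string[] samples = { "ABCSEC", "SIGSEC", "FOOSEC", "SIGPRI" };
        foreach (string s in samples)
        {
            Console.WriteLine("{0}: {1}", s, IsAllowed(s));
        }
    }
}
```

Note that the exclusion applies only to the SEC branch, so SIGPRI and FOOPRI are still accepted.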

+1

aku Oct 16 '08 at 2:56

Why not use more readable code? In my opinion, it is much more convenient to maintain.

 private Boolean HasValidEnding(String input)
 {
     if (input.EndsWith("SEC", StringComparison.Ordinal) || input.EndsWith("PRI", StringComparison.Ordinal))
     {
         if (!input.Equals("SIGSEC", StringComparison.Ordinal))
         {
             return true;
         }
     }
     return false;
 }

or in one line

 private Boolean HasValidEnding(String input)
 {
     return (input.EndsWith("SEC", StringComparison.Ordinal) || input.EndsWith("PRI", StringComparison.Ordinal)) && !input.Equals("SIGSEC", StringComparison.Ordinal);
 }

It's not that I don't use regular expressions, but in this case I wouldn't use them.

+1

Davy Landman Oct 16 '08 at 9:00

Personally, I would be inclined to build the exclusion list in a second variable and then include it in the full expression. This is the approach I have used in the past when I needed to build any complex expression.

Something like:

 exclude = 'someexpression';
 prefix = 'list of prefixes';
 suffix = 'list of suffixes';
 expression = '{prefix}{exclude}{suffix}';
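One possible reading of this idea in C#, offered only as a sketch: the exclusions are kept in an array and spliced into a negative lookahead when the pattern is built. The PatternBuilder class, its Build method, and the choice of a lookahead are all assumptions for illustration, not something the answer specifies.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternBuilder
{
    // Hypothetical helper: builds an anchored pattern that accepts three word
    // characters followed by one of the suffixes, unless the whole input is
    // one of the excluded words.
    public static Regex Build(string[] excludedWords, string[] suffixes)
    {
        // Escape each literal so that special characters cannot change the pattern.
        string exclude = string.Join("|", excludedWords.Select(Regex.Escape));
        string suffix = string.Join("|", suffixes.Select(Regex.Escape));

        string pattern = "^(?!(?:" + exclude + ")$)" + @"\w{3}(?:" + suffix + ")$";
        return new Regex(pattern);
    }

    public static void Main()
    {
        Regex regex = Build(new[] { "SIGSEC", "FOOSEC" }, new[] { "PRI", "SEC" });
        foreach (string s in new[] { "ABCSEC", "SIGSEC", "FOOSEC", "SIGPRI" })
        {
            Console.WriteLine("{0}: {1}", s, regex.IsMatch(s));
        }
    }
}
```

Adding another exclusion then means adding one entry to the array rather than editing the expression by hand.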

0

warren Oct 16 '08 at 2:50

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski

0

Matt Cruikshank Oct 16 '08 at 2:55

You might not even want to handle the exclusions in the regex. For example, if this were Perl (I don't know C#, but you can probably follow along), I would do it like this:

 if ( ( $str =~ /^\w{3}(?:PRI|SEC)$/ ) && ( $str ne 'SIGSEC' ) )

so as to be clear. It does exactly what you want:

Three word characters followed by PRI or SEC, and
It is not SIGSEC

No one says you have to force everything into one regex.
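Since the question is about C#, here is a rough C# rendering of the same two-step check. This is a sketch only; the TwoStepCheck class and IsAllowed method are hypothetical names, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoStepCheck
{
    // Step 1: the general shape, three word characters followed by PRI or SEC.
    private static readonly Regex Shape = new Regex(@"^\w{3}(?:PRI|SEC)$");

    public static bool IsAllowed(string str)
    {
        // Step 2: a plain ordinal string comparison handles the exclusion.
        return Shape.IsMatch(str) && str != "SIGSEC";
    }

    public static void Main()
    {
        foreach (string s in new[] { "ABCPRI", "XYZSEC", "SIGSEC" })
        {
            Console.WriteLine("{0}: {1}", s, IsAllowed(s));
        }
    }
}
```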

0

Andy Lester Oct 16 '08 at 2:56

Go and get RegexBuddy from RegExBuddy.com. It is an amazingly simple tool that will help you work out even the most complex regex.

-1

Toby Allen Oct 16 '08 at 9:08

Dan · Accepted Answer · 2008-10-16T02:50:12+0000

Assuming your regex engine supports negative lookaheads, try the following:

 ((?!SIGSEC)\w{3}(?:SEC|PRI))

Edit: A commenter noted that .NET does support negative lookaheads, so this should work fine (thanks, Charlie).
