Vim Regex: how to search for A AND B NOT C

I have many lines containing the names of the US Presidents Carter, Bush, Clinton, and Obama. Some lines contain 1 of these names, some 2, some 3, and some all 4 (in any order).

I know how to search for Carter AND Clinton AND Obama:

:g/.*Carter\&.*Clinton\&.*Obama/p 

I know how to search for Carter AND (Clinton OR Bush):

 :g/.*Carter\&\(.*Clinton\|.*Bush\)/p 

(There are definitely better ways to do this)

But I cannot figure out (and I have looked at related questions) how to search for, say, Bush AND Clinton NOT Carter, let alone Bush AND Clinton NOT (Carter OR Obama).

vim regex
07 Oct '10 at 16:59
2 answers

To express NOT, use the negative lookahead assertion \@! .

For example, "NOT Bush" would be:

 ^\(.*Bush\)\@! 

or using \v :

 \v^(.*Bush)@! 

Important: note the leading ^ . It is optional if you use only positive assertions (one match is as good as any other), but it is required to anchor negative assertions; otherwise they can still match at a later position, such as the end of the line.
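
The anchoring point carries over to Perl-style engines. As a minimal sketch, assuming Python's re module as a stand-in (there the negative assertion is spelled (?!...) rather than \(...\)\@! ) and a made-up sample line:

```python
import re

line = "Bush and Clinton"  # hypothetical sample line

# Unanchored: the engine may try every position. Once the scan has moved
# past the "B" of "Bush", no "Bush" lies ahead, so the assertion succeeds.
unanchored = re.search(r"(?!.*Bush)", line)

# Anchored: the assertion is only ever tried at the start of the line.
anchored = re.search(r"^(?!.*Bush)", line)

print(unanchored.start() if unanchored else None)
print(anchored)
```

The unanchored pattern reports a (zero-width) match even though the line contains Bush, which is exactly the false positive the ^ prevents.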

Translation of "Bush AND Clinton AND NOT (Carter OR Obama)":

 \v^(.*Bush)&(.*Clinton)&(.*Carter|.*Obama)@! 

Addendum

To explain the relationship between \& and \@= (written here in very magic notation, as after \v ):

 One&Two&Three 

is interchangeable with:

 (One)@=(Two)@=Three 

The only difference is that \& directly mirrors \| (which should make it more obvious and natural), while \@= mirrors Perl's (?=pattern) .
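
To make the Perl correspondence concrete, here is a rough sketch in Python's re module, whose (?=...) is the Perl-style lookahead. The function name has_all_three and the sample strings are illustrative assumptions, and the president names stand in for One, Two and Three so that all parts can match at the same position:

```python
import re

# Vim:  \v(.*Carter)@=(.*Clinton)@=.*Obama
# i.e.  .*Carter\&.*Clinton\&.*Obama  written with lookaheads.
all_three = re.compile(r"(?=.*Carter)(?=.*Clinton).*Obama")

def has_all_three(line: str) -> bool:
    """Return True if the line contains all three names, in any order."""
    # re.match tries the pattern at the start of the string only.
    return all_three.match(line) is not None

print(has_all_three("Obama, Carter, Clinton"))
print(has_all_three("Obama, Carter"))
```

Each lookahead checks one name without consuming text, so the order of the names in the line does not matter.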

Oct 07 '10 at 17:52

If you want to use Perl-style regular expressions later on, forget about \& : it is a Vim-specific feature, and a redundant one, since Vim also has lookaheads, so any r1\&r2 can be rewritten as \%(r1\)\@=r2 . Lookaheads are the better choice because they have a negative version and are also available in most Perl-style regex engines. Your query (Bush AND Clinton AND NOT (Carter OR Obama)) can be expressed as follows:

 g/^\%(.*\%(Carter\|Obama\)\)\@!\%(.*Bush\)\@=.*Clinton/ 

Or, in very magic mode:

 g/^\v%(.*%(Carter|Obama))@!%(.*Bush)@=.*Clinton/ 

See :h /\@=
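
As a sketch of how this carries over to a Perl-style engine, the same query can be written with Python's re module. The sample lines are invented for illustration, and (?:...) plays the role of Vim's \%(...\) :

```python
import re

# Perl-style spelling of  ^\v%(.*%(Carter|Obama))@!%(.*Bush)@=.*Clinton
pattern = re.compile(r"^(?!.*(?:Carter|Obama))(?=.*Bush).*Clinton")

# Hypothetical sample lines.
lines = [
    "Bush, Clinton",
    "Clinton, Bush, Obama",
    "Carter, Bush, Clinton",
    "Clinton and Bush",
    "Clinton",
    "Bush",
]

# Keep lines with Bush AND Clinton AND NOT (Carter OR Obama).
matching = [line for line in lines if pattern.search(line)]
print(matching)  # ['Bush, Clinton', 'Clinton and Bush']
```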

On the internal logic: lookaheads work like branches. For the regular expression (reg1)@=reg2 , suppose reg2 matches at position N (that is, the match starts at position N ). The regex engine then checks whether reg1 also matches at that position. If it does not, the position is discarded and the engine tries the next possible match for reg2 . The same holds for negative lookahead, except that the engine discards the position if reg1 does match.

Example:

Regex: (.b)@!a

String: aba

  • Position 0 ( [a]ba ): a matches. The engine tries the lookahead: . matches a and b matches b , so the lookahead matches and the position is discarded.
  • Position 1 ( a[b]a ): does not match a .
  • Position 2 ( ab[a] ): a matches. The engine tries the lookahead: . matches a , but b cannot match because no characters are left, so the lookahead fails. Result: the regular expression matches at position 2.
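
The walk-through above can be checked in a Perl-style engine. A minimal sketch with Python's re module, assuming (?!.b)a as the equivalent spelling of (.b)@!a :

```python
import re

# Collect the start offset of every match of (?!.b)a in "aba".
positions = [m.start() for m in re.finditer(r"(?!.b)a", "aba")]
print(positions)  # [2]
```

Only the final a is reported: at position 0 the lookahead .b matches "ab", so that position is rejected.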
Oct 07 '10 at 17:41