Why am I getting extra unexpected results with my ack regex?

I am finally learning regular expressions, and I am practicing with ack. I believe it uses Perl regular expressions.

I want to match every line whose first non-blank characters are if (<word> ! with any number of spaces between the elements.

Here is what I came up with:

 ^[ \t]*if *\(\w+ *! 

It almost worked. ^[ \t]* seems to be wrong, because it appears to match only one or no [space or tab]. I want to match any leading run that contains only spaces or tabs (or nothing).

For example, these lines must not match:

 // if (asdf != 0)
 else if (asdf != 1)

How do I change my regex for this?


EDIT: adding the command line

 ack -i --group -a '^\s*if *\(\w+ *!' c:/work/proj/proj 

Note the single quotes; I'm not sure about them anymore.

I am searching a large code base. The results include the expected matches (quite a few), but also lines such as:

 274: }else if (y != 0)

which I also get when running the command above.


EDIT: adding the result of mobrule's test

Mobrule, thanks for providing me with text to test. Here is what I get at the command prompt:

 C:\Temp\regex>more ack.test
 # ack.test
 if (asdf != 0) # no spaces - ok
 if (asdf != 0) # single space - ok
 if (asdf != 0) # single tab - ok
 if (asdf != 0) # multiple space - ok
 if (asdf != 0) # multiple tab - ok
 if (asdf != 0) # spaces + tab ok
 if (asdf != 0) # tab + space ok
 if (asdf != 0) # space + tab + space ok
 // if (asdf != 0) # not ok
 } else if (asdf != 0) # not ok

 C:\Temp\regex>ack '^[ \t]*if *\(\w+ *!' ack.test

 C:\Temp\regex>"C:\Program\git\bin\perl.exe" C:\bat\ack.pl '[ \t]*if *\(\w+ *!' ack.test
 if (asdf != 0) # no spaces - ok
 if (asdf != 0) # single space - ok
 if (asdf != 0) # single tab - ok
 if (asdf != 0) # multiple space - ok
 if (asdf != 0) # multiple tab - ok
 if (asdf != 0) # spaces + tab ok
 if (asdf != 0) # tab + space ok
 if (asdf != 0) # space + tab + space ok
 // if (asdf != 0) # not ok
 } else if (asdf != 0) # not ok

The problem is in my call to ack.bat!

ack.bat contains:

 "C:\Program\git\bin\perl.exe" C:\bat\ack.pl %* 

Although I call it with a caret (^) in the pattern, the caret is gone by the time the bat file runs the command!

Escaping the caret as ^^ does not work.

Quoting the regular expression with " " instead of ' ' works. My problem was a DOS/Windows problem; sorry for bothering you all with it.
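The extra results are what one would expect once the caret is lost: cmd.exe treats ^ as its escape character outside double quotes, so the pattern arrives unanchored. A minimal sketch in Python rather than ack (an assumption made for illustration; Python's re module agrees with Perl on these constructs), using the stray line from the results above:

```python
import re

# The pattern as typed, and the pattern that remains once ^ has been dropped.
anchored = re.compile(r'^[ \t]*if *\(\w+ *!')
unanchored = re.compile(r'[ \t]*if *\(\w+ *!')

line = "}else if (y != 0)"

# search() scans the whole line; only the ^ pins the match to the start.
anchored_hit = anchored.search(line) is not None
unanchored_hit = unanchored.search(line) is not None

print("anchored:  ", anchored_hit)
print("unanchored:", unanchored_hit)
```

Without the anchor, the engine is free to start the match at the space before "if", which is why the else branch shows up.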

+4
3 answers

In both ack and grep, * matches zero or more occurrences, not zero or one. So I think you already have the right solution. Which test cases are not giving you the desired results?

 # ack.test
 if (asdf != 0) # no spaces - ok
 if (asdf != 0) # single space - ok
 if (asdf != 0) # single tab - ok
 if (asdf != 0) # multiple space - ok
 if (asdf != 0) # multiple tab - ok
 if (asdf != 0) # spaces + tab ok
 if (asdf != 0) # tab + space ok
 if (asdf != 0) # space + tab + space ok
 // if (asdf != 0) # not ok
 } else if (asdf != 0) # not ok

Results:

 $ ack '^[ \t]*if *\(\w+ *!' ack.test
 if (asdf != 0) # no spaces - ok
 if (asdf != 0) # single space - ok
 if (asdf != 0) # single tab - ok
 if (asdf != 0) # multiple space - ok
 if (asdf != 0) # multiple tab - ok
 if (asdf != 0) # spaces + tab ok
 if (asdf != 0) # tab + space ok
 if (asdf != 0) # space + tab + space ok

 $ ack -v '^[ \t]*if *\(\w+ *!' ack.test
 // if (asdf != 0) # not ok
 } else if (asdf != 0) # not ok
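
The same check can be reproduced without ack. This is a minimal sketch in Python, not the tool from the question; its re module handles ^, *, \w and [ \t] the same way Perl does for this pattern. The sample lines and their leading whitespace are assumptions modelled on the test file:

```python
import re

# Same pattern as in the ack call; * means "zero or more".
PATTERN = re.compile(r'^[ \t]*if *\(\w+ *!')

# Hypothetical sample lines modelled on ack.test.
lines = [
    "if (asdf != 0)",          # no leading whitespace
    " if (asdf != 0)",         # single space
    "\tif (asdf != 0)",        # single tab
    "   \t \tif (asdf != 0)",  # mixed spaces and tabs
    "// if (asdf != 0)",       # commented out
    "} else if (asdf != 0)",   # else branch
]

matched = [line for line in lines if PATTERN.search(line)]
rejected = [line for line in lines if not PATTERN.search(line)]

for line in matched:
    print("match:   ", repr(line))
for line in rejected:
    print("no match:", repr(line))
```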
+4
 ^\s*if\s*\(\S+\s*! 
  • Use \S for non-whitespace. \w does not match special characters, so if ($word would not match. If that fits your requirements, \w (alphanumerics plus "_") is fine.
 $ perl5.8 -e '{$s = "else if (asdf \!= 1)"; if ($s =~ /^\s*if\s*\((\S+)\s*\!/) {print "|$1|\n";} else {print "NO MATCH\n";}}'
 NO MATCH
 $ perl5.8 -e '{$s = "// if (asdf \!= 0)"; if ($s =~ /^\s*if\s*\((\S+)\s*\!/) {print "|$1|\n";} else {print "NO MATCH\n";}}'
 NO MATCH
 $ perl5.8 -e '{$s = "if (asdf \!= 0)"; if ($s =~ /^\s*if\s*\((\S+)\s*\!/) {print "|$1|\n";} else {print "NO MATCH\n";}}'
 |asdf|
 $ perl5.8 -e '{$s = "if (asdf \!= 0)"; if ($s =~ /^\s*if\s*\((\S+)\s*\!/) {print "|$1|\n";} else {print "NO MATCH\n";}}'
 |asdf|
 $ perl5.8 -e '{$s = "if (\$asdf \!= 0)"; if ($s =~ /^\s*if\s*\((\S+)\s*\!/) {print "|$1|\n";} else {print "NO MATCH\n";}}'
 |$asdf|
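
To compare \w and \S side by side, here is a minimal sketch in Python (an assumption made for illustration; for these constructs its re module behaves like Perl). The sample strings are taken from the tests above:

```python
import re

# Identical patterns except for the class used to capture the operand.
word_only = re.compile(r'^\s*if\s*\((\w+)\s*!')
non_space = re.compile(r'^\s*if\s*\((\S+)\s*!')

samples = ["if (asdf != 0)", "if ($asdf != 0)", "else if (asdf != 1)"]

results = {}
for s in samples:
    w = word_only.search(s)
    n = non_space.search(s)
    # Store the captured operand for each pattern, or None when it does not match.
    results[s] = (w.group(1) if w else None, n.group(1) if n else None)
    print(repr(s), "->", results[s])
```

Only the \S version captures $asdf; both versions reject the else branch because of the ^ anchor.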
+6

You may try:

 (?:\t*| *)if *\(\w+ *! 

Note that

 \t*| *

matches zero or more tabs or zero or more spaces, but not a mixture of spaces and tabs.
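
The difference from a character class shows up only with mixed indentation. This is a minimal sketch in Python (an assumption made for illustration); the leading ^ is added here so that the indentation is actually tested, and the sample strings are hypothetical:

```python
import re

# ^ is an addition for this sketch; the pattern above is written without it.
alternation = re.compile(r'^(?:\t*| *)if *\(\w+ *!')
char_class = re.compile(r'^[ \t]*if *\(\w+ *!')

samples = {
    "tabs only": "\t\tif (asdf != 0)",
    "spaces only": "   if (asdf != 0)",
    "mixed": " \t if (asdf != 0)",
}

# For each sample: (matches with alternation, matches with character class)
outcome = {
    name: (bool(alternation.search(text)), bool(char_class.search(text)))
    for name, text in samples.items()
}

for name, flags in outcome.items():
    print(name, flags)
```

The alternation accepts pure tabs or pure spaces, while [ \t]* also accepts the mixed case.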

+1

Source: https://habr.com/ru/post/1306938/

