Help creating a regex

I need to create a regular expression that will find the word "int" only if it is not part of some string.

I want to find out whether int is used in code (not inside a string literal, only in regular code).

Example:

 int i; // the regex should find this one.
 String example = "int i"; // the regex should ignore this line.
 logger.i("int"); // the regex should ignore this line.
 logger.i("int") + int.toString(); // the regex should find this one (because of the second int)

thanks!

+6
5 answers

This will not be bulletproof, but it works for all of your test cases:

 (?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$) 

It uses a lookbehind and a lookahead to assert that there are either zero or two quotes (") before and after the match.

Here's the code in java with the output:

  String regex = "(?<=^([^\"]*|[^\"]*\"[^\"]*\"[^\"]*))\\bint\\b(?=([^\"]*|[^\"]*\"[^\"]*\"[^\"]*)$)";
  System.out.println(regex);
  String[] tests = new String[] {
      "int i;",
      "String example = \"int i\";",
      "logger.i(\"int\");",
      "logger.i(\"int\") + int.toString();"
  };
  for (String test : tests) {
      System.out.println(test.matches("^.*" + regex + ".*$") + ": " + test);
  }

Output (the regular expression is printed first so you can read it without all the \ escapes):

 (?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
 true: int i;
 false: String example = "int i";
 false: logger.i("int");
 true: logger.i("int") + int.toString();

A regular expression will never be 100% accurate here; for that you need a language parser. Consider escaped quotation marks inside strings, such as "foo\"bar", or quotes inside comments, such as /* foo " bar */, etc.
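As a rough illustration of how to handle the escaped-quote case without a full parser, here is a minimal sketch that first blanks out double-quoted literals (honoring backslash escapes) and then searches what is left for \bint\b. The class and method names are made up for this example, and it still does not handle comments or char literals.

```java
import java.util.regex.Pattern;

class IntOutsideStrings {
    // A double-quoted literal; \\. lets escaped characters such as \" stay inside it.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:\\\\.|[^\"\\\\])*\"");
    private static final Pattern INT_WORD = Pattern.compile("\\bint\\b");

    // Hypothetical helper: true if "int" appears as a whole word outside string literals.
    static boolean hasIntOutsideStrings(String line) {
        // Replace every literal with an empty one so its contents cannot match.
        String withoutStrings = STRING_LITERAL.matcher(line).replaceAll("\"\"");
        return INT_WORD.matcher(withoutStrings).find();
    }

    public static void main(String[] args) {
        String[] tests = {
            "int i;",
            "String example = \"int i\";",
            "logger.i(\"int\");",
            "logger.i(\"int\") + int.toString();",
            "String s = \"foo\\\"int\\\"bar\";" // escaped quotes inside the literal
        };
        for (String test : tests) {
            System.out.println(hasIntOutsideStrings(test) + ": " + test);
        }
    }
}
```

Stripping the literals first keeps each regex simple and avoids the lookbehind entirely, at the cost of a second pass over the line.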

+4

It’s not clear what your complete requirements are, but perhaps:

 ^\s*\bint\b

0

Assuming the input is processed one line at a time,

 ^int\s[\$_a-zA-Z\;]*$

follows the basic rules for naming variables :)

0

If you think you need to parse the code and look for the isolated word int, this works:

 (^int|[\(\ \;,]int) 

You can use it to search for an int that is either the first word on the line or is preceded by a space, a comma, a semicolon, or an opening parenthesis.

You can try it out and refine it at http://www.regextester.com/

PS: this works in all your test cases.
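A quick way to check this, sketched in Java: the pattern is the one above, while the class name, the helper method, and the fifth input line are additions made for illustration only.

```java
import java.util.regex.Pattern;

class LeadingCharCheck {
    // The pattern from this answer, unchanged.
    static final Pattern INT_AFTER_DELIMITER =
            Pattern.compile("(^int|[\\(\\ \\;,]int)");

    // Hypothetical helper: true if the pattern is found anywhere in the line.
    static boolean matches(String line) {
        return INT_AFTER_DELIMITER.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = {
            "int i;",
            "String example = \"int i\";",
            "logger.i(\"int\");",
            "logger.i(\"int\") + int.toString();",
            "String s = \"an int\";" // added case: int after a space, inside a string
        };
        for (String line : lines) {
            System.out.println(matches(line) + ": " + line);
        }
    }
}
```

The four lines from the question come out as expected, but the added fifth line is also reported as a match, because the int inside the string follows a space. The pattern also has no trailing word boundary, so a word such as integer after a space matches as well.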

0

 ^[^"]*\bint\b

should work. I cannot think of a situation where int could appear as a valid identifier after a " character. Of course, this only holds if the code is limited to one statement per line.

0

Source: https://habr.com/ru/post/891389/

