This will not be bulletproof, but it works for all of your test cases:
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
It uses a lookbehind and a lookahead to assert that there are either zero or two quotes before the match and either zero or two quotes after it.
Here's the code in Java, with its output:
String regex = "(?<=^([^\"]*|[^\"]*\"[^\"]*\"[^\"]*))\\bint\\b(?=([^\"]*|[^\"]*\"[^\"]*\"[^\"]*)$)";
System.out.println(regex);
String[] tests = new String[] {
    "int i;",
    "String example = \"int i\";",
    "logger.i(\"int\");",
    "logger.i(\"int\") + int.toString();"
};
for (String test : tests) {
    System.out.println(test.matches("^.*" + regex + ".*$") + ": " + test);
}
Output (the regular expression is printed as well, so you can read it without all the backslash escapes):
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
true: int i;
false: String example = "int i";
false: logger.i("int");
true: logger.i("int") + int.toString();
Using a regular expression will never be 100% accurate here; you need a language parser. Consider escaped quotes inside strings, such as "foo\"bar", or quotes inside comments, such as /* foo " bar */, etc.
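If a full parser is more than you need, a small hand-written scanner already covers the two cases just mentioned. The following is only a minimal sketch, not a complete Java lexer: the class name KeywordScanner and the method findOutsideStrings are made up for illustration, and it assumes the input contains no char literals such as '"' and no text blocks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class KeywordScanner {

    /** Returns the start offsets of whole-word occurrences of keyword outside string literals and comments. */
    static List<Integer> findOutsideStrings(String src, String keyword) {
        List<Integer> hits = new ArrayList<>();
        int n = src.length();
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"') {
                // Skip a string literal, honoring backslash escapes such as \".
                i++;
                while (i < n && src.charAt(i) != '"') {
                    if (src.charAt(i) == '\\') {
                        i++; // skip the backslash; the escaped char is skipped below
                    }
                    i++;
                }
                i++; // step past the closing quote
            } else if (src.startsWith("/*", i)) {
                // Skip a block comment; an unterminated one swallows the rest.
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
            } else if (src.startsWith("//", i)) {
                // Skip a line comment up to and including the newline.
                int end = src.indexOf('\n', i);
                i = (end < 0) ? n : end + 1;
            } else if (Character.isJavaIdentifierPart(c)) {
                // Consume a whole run of word characters, so "print" or "int2" never match "int".
                int start = i;
                while (i < n && Character.isJavaIdentifierPart(src.charAt(i))) {
                    i++;
                }
                if (src.substring(start, i).equals(keyword)) {
                    hits.add(start);
                }
            } else {
                i++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] tests = {
            "int i;",
            "String example = \"int i\";",
            "logger.i(\"int\") + int.toString();",
            "log(\"foo\\\"bar\") + int",
            "/* foo \" bar */ int x;"
        };
        for (String test : tests) {
            System.out.println(findOutsideStrings(test, "int") + ": " + test);
        }
    }
}
```

Unlike the regex, this keeps working when a line contains more than two quotes, because it tracks whether it is inside a string instead of counting quotes on either side of the match.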