The behavior you see has nothing to do with the Pattern.CANON_EQ flag.
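The input read from the console is not the same thing as the Java string literal. When a user (presumably you, testing this flag) types \u00E5 into the console, the line returned by console.readLine is equivalent to the literal "\\u00E5" (six characters: a backslash, the letter u, and four hex digits), not "å". Unicode escapes are translated by the compiler when it reads source code; nothing translates them in text that arrives at runtime. See for yourself: http://ideone.com/lF7D1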
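To make the difference concrete, here is a minimal sketch. The class name EscapeDemo and the helper unescape are made up for illustration (nothing like unescape is built into the JDK), and the helper only handles the simple case of a backslash, a u, and exactly four hex digits. It shows that the typed text and the literal are different strings, and one possible way to decode the typed form if that is what you actually want:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical helper, not a JDK method: replaces every literal
    // "backslash, u, four hex digits" sequence with the character it names.
    static String unescape(String raw) {
        Matcher m = Pattern.compile("\\\\u([0-9A-Fa-f]{4})").matcher(raw);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' and '\' in the replacement
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String typed = "\\u00E5";   // same content readLine returns for the typed text
        String literal = "\u00E5";  // the compiler turns this into one character

        System.out.println(typed.length());                   // 6
        System.out.println(literal.length());                 // 1
        System.out.println(typed.equals(literal));            // false
        System.out.println(unescape(typed).equals(literal));  // true
    }
}
```

If you want to test the flag interactively, typing the actual character å (with a console encoding that supports it) avoids the need for any decoding step.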
As for Pattern.CANON_EQ, it behaves exactly as described:
    Pattern withCE = Pattern.compile("^a\u030A$", Pattern.CANON_EQ);
    Pattern withoutCE = Pattern.compile("^a\u030A$");
    String input = "\u00E5";

    System.out.println("Matches with canon eq: "
            + withCE.matcher(input).matches());    // true
    System.out.println("Matches without canon eq: "
            + withoutCE.matcher(input).matches()); // false
http://ideone.com/nEV1V
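For some intuition about why those two strings are treated as equivalent, the sketch below uses java.text.Normalizer to compare them after normalizing both to NFC. This only illustrates canonical equivalence itself; it is not a claim about how Pattern implements the flag internally. The class name CanonDemo and the method canonicallyEqual are illustrative names, not JDK API:

```java
import java.text.Normalizer;

public class CanonDemo {
    // Illustrative helper: two strings are canonically equivalent
    // if they are identical after normalization to the same form.
    static boolean canonicallyEqual(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String decomposed = "a\u030A"; // 'a' followed by COMBINING RING ABOVE
        String composed = "\u00E5";    // precomposed LATIN SMALL LETTER A WITH RING ABOVE

        System.out.println(decomposed.equals(composed));            // false
        System.out.println(canonicallyEqual(decomposed, composed)); // true
    }
}
```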