I have the following code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {
    StringBuilder content = new StringBuilder("abcd efg h i. - – jk(lmn) qq zz.");
    String patternSource = "[.-–]($| )";
    Pattern pattern = Pattern.compile(patternSource);
    Matcher matcher = pattern.matcher(content);
    System.out.println(matcher.replaceAll(""));
}
where the patternSource character class consists of a period, a hyphen-minus, and the \u2013 character (an en dash). When executed, it prints:
abcefi- jk(lmn) qzz
If I change the order of the characters in my character class in any way, it starts working fine and prints:
abcd efg h i jk(lmn) qq zz
What the heck?
Tested under JDK / JRE 1.6.0_23
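To narrow this down, here is a minimal sketch that probes single characters against the character class on its own. It assumes the hyphen sitting between the period and \u2013 is being parsed as a range operator (U+002E through U+2013) rather than as a literal hyphen. The class name CharClassRangeCheck and the helper matchesOne are made up for illustration and are not part of my original code.

```java
import java.util.regex.Pattern;

public class CharClassRangeCheck {
    // Class as written in the question: period, hyphen, en dash (U+2013), in that order.
    static final Pattern ORIGINAL = Pattern.compile("[.-\u2013]");
    // Same three characters, with the hyphen moved to the end of the class.
    static final Pattern REORDERED = Pattern.compile("[.\u2013-]");

    // Hypothetical helper: true if the single character c matches the whole pattern.
    static boolean matchesOne(Pattern p, char c) {
        return p.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // Probe the three intended characters plus a few that should not match.
        char[] probes = {'.', '-', '\u2013', 'd', 'q', ')'};
        for (char c : probes) {
            System.out.printf("U+%04X original=%b reordered=%b%n",
                    (int) c, matchesOne(ORIGINAL, c), matchesOne(REORDERED, c));
        }
    }
}
```

If the range assumption holds, the original class should match letters such as d and q but not the hyphen itself, which would be consistent with the first output above.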