I support a Java application where the developers have implemented some regex-based filtering. To be as general as possible, they compile the patterns with the MULTILINE flag.
The other day I noticed something unexpected. In Java, the pattern "^\\s*$" does not match "" (the empty string) with the MULTILINE flag. It does match without that flag.
    Pattern pattern = Pattern.compile("^\\s*$", Pattern.MULTILINE);
    Matcher matcher = pattern.matcher("");
    System.out.println("Multiline: " + matcher.find());

    pattern = Pattern.compile("^\\s*$");
    matcher = pattern.matcher("");
    System.out.println("No-multiline: " + matcher.find());
This produces the following output:

    Multiline: false
    No-multiline: true
The same results can be seen with matches():

    System.out.println("Multiline: " + ("".matches("(?m)^\\s*$")));
    System.out.println("No-multiline: " + ("".matches("^\\s*$")));
I would expect all cases to match.
In Python, that is what happens. This:
    import re
    print(re.search(r'^\s*$', "", re.MULTILINE))
    print(re.search(r'^\s*$', ""))
gives:
    <_sre.SRE_Match object; span=(0, 0), match=''>
    <_sre.SRE_Match object; span=(0, 0), match=''>
In Perl, both cases match as well, and I think I remember that it is the same for PHP.
I would really appreciate it if anyone could explain why Java handles this case differently.
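In case it helps narrow things down, here is a minimal sketch that probes each piece of the pattern separately against the empty string, with and without MULTILINE. The class name MultilineProbe and the helper finds are made up for this post; only Pattern.compile, matcher and find come from the JDK.

```java
import java.util.regex.Pattern;

public class MultilineProbe {
    // Hypothetical helper: true if the regex, compiled with the given flags,
    // is found anywhere in the input.
    static boolean finds(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Each anchor on its own, the whitespace part on its own, then the full pattern.
        String[] probes = {"^", "$", "\\s*", "^\\s*$"};
        for (String regex : probes) {
            System.out.println(regex
                    + "  multiline=" + finds(regex, Pattern.MULTILINE, "")
                    + "  default=" + finds(regex, 0, ""));
        }
    }
}
```

Whichever probe flips between the two columns should point at the part of the pattern that MULTILINE treats differently for empty input.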