This is essentially the expected behavior of both classes. The Scanner's first job is to split the input into tokens using your delimiter. So it (lazily) takes your sourceString and treats it as the following sequence of tokens: \r , \n , \n , \r , \r , \n and \n . When you call hasNext with a pattern, it checks whether the next complete token matches that pattern (which they all trivially do, thanks to the ? on \r\n? ). Thus, the while loop iterates over each of the 7 tokens.
The Matcher, on the other hand, applies the regular expression greedily to the whole string, so it consumes \r\n together as a single match, as you expect.
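To see the difference side by side, here is a minimal sketch. It rests on assumptions, since the question's code is not reproduced here: the source string is rebuilt from the seven tokens listed above, the delimiter is assumed to be the empty string (which is what makes every character its own token), and the pattern is assumed to be (\r\n?|\n). The class and method names are made up for illustration.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScannerVsMatcher {
    // Scanner: splits on the delimiter first, then tests each whole token
    // against the pattern.
    static int countWithScanner(String source, Pattern pattern) {
        int count = 0;
        // Assumption: an empty delimiter, so every character is one token.
        try (Scanner scanner = new Scanner(source).useDelimiter("")) {
            while (scanner.hasNext(pattern)) {
                scanner.next(pattern);
                count++;
            }
        }
        return count;
    }

    // Matcher: no tokenizing at all, it searches the whole string greedily.
    static int countWithMatcher(String source, Pattern pattern) {
        int count = 0;
        Matcher matcher = pattern.matcher(source);
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumption: rebuilt from the tokens \r, \n, \n, \r, \r, \n, \n.
        String source = "\r\n\n\r\r\n\n";
        // Assumption: the pattern from the question.
        Pattern pattern = Pattern.compile("(\\r\\n?|\\n)");

        System.out.println(countWithScanner(source, pattern)); // 7
        System.out.println(countWithMatcher(source, pattern)); // 5
    }
}
```

The Scanner sees seven one-character tokens, each of which matches on its own. The Matcher finds \r\n, \n, \r, \r\n and \n, so it reports five.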
One way to highlight the Scanner's behavior is to change your regular expression to (\\r\\n|\\n) . This results in a count of 0. The Scanner reads the first token as \r (not \r\n ), sees that it does not match your pattern, and so hasNext returns false on the very first call.
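A small sketch of that check, under the same assumptions as before (empty-string delimiter, source string rebuilt from the listed tokens; the class and method names are hypothetical):

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class StrictPatternDemo {
    // Returns the first token the Scanner produces, ignoring any pattern.
    static String firstToken(String source) {
        // Assumption: an empty delimiter, so every character is one token.
        try (Scanner scanner = new Scanner(source).useDelimiter("")) {
            return scanner.next();
        }
    }

    // Reports whether the first token, taken as a whole, matches the pattern.
    static boolean firstTokenMatches(String source, Pattern pattern) {
        try (Scanner scanner = new Scanner(source).useDelimiter("")) {
            return scanner.hasNext(pattern);
        }
    }

    public static void main(String[] args) {
        String source = "\r\n\n\r\r\n\n";
        Pattern strict = Pattern.compile("(\\r\\n|\\n)");

        // The first token is the lone \r, not \r\n.
        System.out.println(firstToken(source).equals("\r")); // true
        // A lone \r matches neither alternative, so the loop never starts.
        System.out.println(firstTokenMatches(source, strict)); // false
    }
}
```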
(Short version: the Scanner tokenizes using the delimiter before it applies your token pattern; the Matcher does not do any form of tokenizing.)