Why is "\r\n".split("\r\n") returning an empty array?

I use the regular expression "[\r\n\f]+" to count the number of lines contained in a String. My code is as follows:

Pattern pattern = Pattern.compile("[\\r\\n\\f]+");
String[] lines = pattern.split(texts);

In my unit test, I have input strings like these:

"\t\t\t    \r\n      \n"
"\r\n"

Splitting the first string gives 2 elements; splitting the second string, however, gives 0.

I expected the second string to contain 1 line, even though that line is "empty" (if I open a file that starts with "\r\n" in a text editor, the caret can be placed on the second line). Is my regex wrong for splitting lines, or am I missing something here?

Edit:

Let me make the question clearer:

Why

// notice the trailing space in the string
"\r\n ".split("\r\n").length == 2 // results in 2 strings {"", " "}. So this block of text has two lines.

but

// notice there is no trailing space in the string
"\r\n".split("\r\n").length == 0 // results in an empty array. Why is "" (the empty string) not in the result, so that this block of text contains 0 lines?

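See the documentation of Pattern.split(CharSequence):

It behaves like the two-argument split method called with a limit of zero, so trailing empty strings are not included in the resulting array. For "\r\n" every piece around the separator is empty, so nothing remains.

If you want to keep the trailing empty strings, pass a negative limit (for example -1):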

String[] lines = pattern.split(texts, -1);
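To make the difference between the two limits concrete, here is a small sketch that runs the question's pattern both ways. The class and method names (SplitLimitDemo, splitDefault, splitKeepTrailing) are made up for illustration; only Pattern.split itself comes from the JDK.

```java
import java.util.regex.Pattern;

public class SplitLimitDemo {
    // Same character class as in the question.
    private static final Pattern LINE_BREAKS = Pattern.compile("[\\r\\n\\f]+");

    // Limit 0 (what the one-argument split uses): trailing empty strings are dropped.
    static String[] splitDefault(String text) {
        return LINE_BREAKS.split(text);
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepTrailing(String text) {
        return LINE_BREAKS.split(text, -1);
    }

    public static void main(String[] args) {
        String text = "\r\n";
        System.out.println(splitDefault(text).length);      // prints 0
        System.out.println(splitKeepTrailing(text).length); // prints 2
    }
}
```

Note that with a negative limit "\r\n" yields two empty strings (one before and one after the separator), so text that ends with a line break always produces one extra trailing element. If you count lines this way, you probably need to account for that.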

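Also keep in mind that line terminators differ between systems. From Wikipedia:

LF: Multics, Unix and Unix-like systems (GNU/Linux, OS X, FreeBSD, AIX, Xenix, etc.), BeOS, Amiga, RISC OS and others.

CR: Commodore 8-bit machines, Acorn BBC, ZX Spectrum, TRS-80, Apple II family, Mac OS up to version 9 and OS-9

RS: QNX pre-POSIX implementation.

0x9B: Atari 8-bit machines using the ATASCII variant of ASCII. (155 in decimal)

LF + CR: Acorn BBC and RISC OS spooled text output.

CR + LF: Microsoft Windows, DEC TOPS-10, RT-11 and most other early non-Unix and non-IBM OSes, CP/M, MP/M, DOS (MS-DOS, PC DOS, etc.), Atari TOS, OS/2, Symbian OS, Palm OS, Amstrad CPC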
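If the input may mix several of these conventions, one option (an addition here, not part of the quoted list) is the \R linebreak matcher that java.util.regex supports since Java 8. It matches CR+LF as a single break, as well as a lone LF, CR, form feed and a few Unicode separators. It does not cover RS or 0x9B, and it treats LF+CR as two breaks. The class name LinebreakDemo below is hypothetical.

```java
import java.util.regex.Pattern;

public class LinebreakDemo {
    // \R is the Java 8+ linebreak matcher; "\r\n" is consumed as one break.
    private static final Pattern ANY_BREAK = Pattern.compile("\\R");

    static String[] splitLines(String text) {
        // Negative limit so that trailing empty lines are not silently dropped.
        return ANY_BREAK.split(text, -1);
    }

    public static void main(String[] args) {
        // CR+LF, LF and CR mixed in one string.
        String mixed = "a\r\nb\nc\rd";
        for (String line : splitLines(mixed)) {
            System.out.println("[" + line + "]");
        }
    }
}
```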

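So, to count lines regardless of the terminator, I would consider reading the text line by line instead: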

    // readLine() treats "\n", "\r" and "\r\n" as line terminators
    String test = "\t\t\t    \r\n      \n";
    BufferedReader reader = new BufferedReader(new StringReader(test));
    int count = 0;
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(++count + ":" + line);
    }
    System.out.println("total lines == " + count);
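The snippet above can be wrapped into a reusable helper. This is only a sketch: the names LineCounter and countLines are invented here, and the IOException that readLine() declares is rethrown unchecked because a StringReader is not expected to fail.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineCounter {
    // Counts lines the way BufferedReader.readLine() sees them.
    static int countLines(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLines("\r\n"));                     // prints 1
        System.out.println(countLines("\t\t\t    \r\n      \n")); // prints 2
    }
}
```

With this approach "\r\n" counts as one (empty) line, which matches what the question expected, while an empty string counts as zero lines.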

, .ready()


Source: https://habr.com/ru/post/1542685/

