Java regular expression: match at the boundary?

I found the following question in a Java test package:

    Pattern p = Pattern.compile("[wow]*");
    Matcher m = p.matcher("wow its cool");
    boolean b = false;
    while (b = m.find()) {
        System.out.print(m.start() + " \"" + m.group() + "\" ");
    }

where the output is as follows:

    0 "wow" 3 "" 4 "" 5 "" 6 "" 7 "" 8 "" 9 "oo" 11 "" 12 ""

Up to the last match it is clear: the pattern [wow]* greedily matches zero or more 'w' and 'o' characters, while for the other characters, including spaces, it produces empty matches. However, after the empty match at the last 'l' (11 ""), the following 12 "" seems unclear. There are no details in the test solution, and I couldn't work it out definitively from the Javadoc. My best guess is a boundary character, but I would appreciate it if anyone could give an explanation.

+4
3 answers

The trailing empty match is easier to see in a reduced example. Match the same pattern against an empty string:

    Pattern p = Pattern.compile("[wow]*"); // One of the two 'w's is redundant, but the engine is OK with it
    Matcher m = p.matcher("");             // Passing an empty string results in a valid match that is empty
    boolean b = false;
    while (b = m.find()) {
        System.out.print(m.start() + " \"" + m.group() + "\" ");
    }

This prints 0 "", so matching against an empty string yields one valid, empty match at index 0.

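The same thing happens at the end of your input. Every character the class cannot consume, such as " ", produces an empty match "" at its position. After the empty match at index 11, the engine tries once more at index 12, where the remaining input is "". This is the same situation as matching against "wow its cool".substring(12): an empty string.

So no special boundary character is involved. The end of the input is simply one more position where a zero-length match can occur.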
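A minimal sketch to check this reasoning. The class name `EndOfInputMatch` and the helper `firstMatchFrom` are made up for illustration; only `Pattern`, `Matcher.find(int)`, `start()` and `end()` come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EndOfInputMatch {
    // Hypothetical helper: starts searching at index `from` and returns
    // "start:end" of the first match found, or "none" if there is no match.
    static String firstMatchFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(from) ? m.start() + ":" + m.end() : "none";
    }

    public static void main(String[] args) {
        String input = "wow its cool";
        System.out.println(input.length());                      // 12
        System.out.println(input.substring(12).isEmpty());       // true
        // Searching from the end of the input still finds a zero-length match.
        System.out.println(firstMatchFrom("[wow]*", input, 12)); // 12:12
        // Same result for an input that is empty from the start.
        System.out.println(firstMatchFrom("[wow]*", "", 0));     // 0:0
    }
}
```

A match whose start and end are equal is a zero-length match, which is what `m.group()` prints as "".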

+3
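  • [wow]* first matches wow. count = 1

  • Next comes a space. Because * means zero or more (so no characters at all is acceptable), [wow]* still matches there, but the match is empty. Count = 2.

  • Then its. None of i, t, s is in the class, so each position gives an empty match. count 2+3=5.

  • Then another space. Again nothing from the class, so the match is empty. 5+1=6

  • Then c. It is neither w nor o, so the match in front of c is empty. 6+1=7

  • Then oo. Both characters are accepted by [wow]*. The quantifier is greedy, so it takes them together as oo, which is 1 match. 7+1=8.

  • Then l. An empty match again. Count = 9

  • Finally, the end of the input. Nothing is left, so the empty string matches, 9+1=10

  • In every case, empty or not, m.start() gives the index where the match begins.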
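The count above can be reproduced with a short sketch. The class `MatchCounter` and the method `allMatches` are illustrative names, not part of the original test; the entries use the same format as the question's output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchCounter {
    // Collects every match as: start index, space, matched text in quotes.
    static List<String> allMatches(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.start() + " \"" + m.group() + "\"");
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> matches = allMatches("[wow]*", "wow its cool");
        System.out.println(matches.size()); // 10
        // 0 "wow" 3 "" 4 "" 5 "" 6 "" 7 "" 8 "" 9 "oo" 11 "" 12 ""
        System.out.println(String.join(" ", matches));
    }
}
```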


+3

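Your guess is close. The 12 "" result is a match at the position just past the last character, which is the end of the input. Since the pattern accepts zero characters, it matches there as well.

The engine has to attempt a match at that position, because some constructs can only match at the end of the input (for example $).

In other words, if the engine did not test the position after the last character, nothing could ever match at the end of the input, yet many regular expression constructs do match there (and you have shown one of them here).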
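A small sketch of that point, using the illustrative names `EndAnchorDemo` and `matchStarts`. It shows that `$` on its own matches at index 12 of the question's input. For contrast, it also shows that `[wow]+`, which requires at least one character, produces no empty matches at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EndAnchorDemo {
    // Returns the start index of every match of `regex` in `input`.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "wow its cool";
        // The end anchor matches only at the position after the last character.
        System.out.println(matchStarts("$", input));      // [12]
        // With + instead of *, only the non-empty matches remain.
        System.out.println(matchStarts("[wow]+", input)); // [0, 9]
    }
}
```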

0

Source: https://habr.com/ru/post/1569126/

