Regular expression for pattern matching and grouping of correct elements

I have the following lines:

//ABCD E 1234 L1
//ABCD E 1234,2345 L2
//ABCD E 4567
//ABCD E 2435,4679

To match the lines above, I wrote the pattern shown below.

Pattern.compile("//\\s*ABCD\\s+E\\s+(((\\d*,\\s*\\d*)*)|(\\d*+(((,\\s*\\d*,\\s*\\d*)*)))+)\\s*(.*?)",
        Pattern.CASE_INSENSITIVE);

All of the lines above match this pattern. But when I try to get the numbers after ABCD E (group 1) and the label (group 8), I get the wrong result.

//ABCD E 1234 L1 and //ABCD E 4567 give the wrong result: group(1) is empty for both lines, and group(8) is 1234 L1 and 4567 respectively.

I suspect that (.*?) is the culprit here, but I'm not sure what else to use.
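The behaviour described above can be reproduced with a minimal sketch. It assumes the pattern is applied with `matches()` (the post does not say which method was used), and the class and method names are only illustrative. It suggests that the first alternative, `((\d*,\s*\d*)*)`, is allowed to match the empty string when there is no comma, so the lazy `(.*?)` ends up taking the numbers as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyGroupRepro {
    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
            "//\\s*ABCD\\s+E\\s+(((\\d*,\\s*\\d*)*)|(\\d*+(((,\\s*\\d*,\\s*\\d*)*)))+)\\s*(.*?)",
            Pattern.CASE_INSENSITIVE);

    // Returns "group1|group8" for a line, or null when the line does not match.
    // Assumption: the whole line must match, hence matches() rather than find().
    static String groups(String line) {
        Matcher m = ORIGINAL.matcher(line);
        return m.matches() ? m.group(1) + "|" + m.group(8) : null;
    }

    public static void main(String[] args) {
        // No comma: group 1 is empty and group 8 swallows the number.
        System.out.println(groups("//ABCD E 1234 L1"));
        // With a comma: group 1 holds the numbers and group 8 the label.
        System.out.println(groups("//ABCD E 1234,2345 L2"));
    }
}
```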

If someone knows a good pattern to match the lines above, please let me know.

P.S.: ABCD E 1234, 2345, 456 ... L1, L2 are labels. ABCD E - (no characters),


(\s*ABCD\s+E\s+)(?<num>[\d,]+)\s*(?<label>[A-Z]\d)?
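  • (\s*ABCD\s+E\s+) - matches the ABCD E prefix, with optional leading whitespace and at least one whitespace character after each part
  • (?<num>[\d,]+) - named group num: one or more digits or commas
  • (?<label>[A-Z]\d)? - optional named group label: an uppercase letter followed by a digit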
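A minimal Java sketch of how the named groups above might be read. The class and method names are hypothetical, and `find()` is used on the assumption that the leading `//` should simply be skipped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    // The regex from this answer, with backslashes doubled for a Java string.
    static final Pattern LINE = Pattern.compile(
            "(\\s*ABCD\\s+E\\s+)(?<num>[\\d,]+)\\s*(?<label>[A-Z]\\d)?");

    // Returns {numbers, label}; label is null when the line has none.
    // Returns null when the line does not contain the ABCD E prefix at all.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("num"), m.group("label") };
    }

    public static void main(String[] args) {
        String[] lines = {
            "//ABCD E 1234 L1", "//ABCD E 1234,2345 L2",
            "//ABCD E 4567", "//ABCD E 2435,4679"
        };
        for (String line : lines) {
            String[] parts = parse(line);
            System.out.println("num=" + parts[0] + ", label=" + parts[1]);
        }
    }
}
```

Named groups are read with `Matcher.group(String)`; an optional group that did not take part in the match comes back as null rather than as an empty string.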


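To get 3 groups, with each number captured separately in num, use \G so that every match continues where the previous one ended: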

(?<pre>ABCD\s*E\s*)|(?<=\G),?(?<num>\d+)|(?<=\G)\s*(?<label>[A-Z]\d)
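  • (?<pre>ABCD\s*E\s*) - matches the ABCD E prefix, followed by optional whitespace
  • | - or
  • (?<=\G),?(?<num>\d+) - starting where the previous match ended, an optional comma followed by a number, captured in num
  • | - or
  • (?<=\G)\s*(?<label>[A-Z]\d) - starting where the previous match ended, a label (an uppercase letter and a digit), preceded by optional whitespace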
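As a rough illustration, the alternation above could be driven with repeated `find()` calls, each call returning the prefix, one number, or the label. This is a sketch only: the pattern is copied from the answer unchanged, while the class name and the `tokens` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GAnchorDemo {
    // The \G-based regex from this answer, with backslashes doubled for Java.
    static final Pattern TOKEN = Pattern.compile(
            "(?<pre>ABCD\\s*E\\s*)|(?<=\\G),?(?<num>\\d+)|(?<=\\G)\\s*(?<label>[A-Z]\\d)");

    // Collects every number and the label (if any) found on one line.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            // Exactly one of pre, num, label is set per match; pre is skipped.
            if (m.group("num") != null) {
                out.add("num=" + m.group("num"));
            } else if (m.group("label") != null) {
                out.add("label=" + m.group("label"));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("//ABCD E 1234,2345 L2"));
        System.out.println(tokens("//ABCD E 4567"));
    }
}
```

Because each alternative after the prefix is tied to the end of the previous match, numbers elsewhere on the line are not picked up by accident.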

^\s*(\/\/ABCD)\s+(E)\s+(\d+([,.]\d+)?)\s+([A-Z0-9]*)?

#0  //ABCD E 1234 L1
#1  //ABCD
#2  E
#3  1234
#4  null
#5  L1

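import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CsvMatchDemo {
    public static void main(String[] args) {
        String input = "ABCD E 456,7689,687 M1X";

        // Group 1: the comma-separated numbers; group 2: everything after them.
        Pattern pattern = Pattern.compile("[^0-9]*\\s([0-9,]*)\\s([^0-9].*)");
        Matcher queryMatcher = pattern.matcher(input);

        String csvMatch = "";
        String otherMatch = "";
        if (queryMatcher.find()) {
            csvMatch   = queryMatcher.group(1);
            otherMatch = queryMatcher.group(2);
        }
        // Split the captured list into individual numbers.
        String[] matches = csvMatch.split(",");

        for (String match : matches) {
            System.out.println("Found a number: " + match);
        }
        System.out.println("Found other: " + otherMatch);
    }
}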

Output:

Found a number: 456
Found a number: 7689
Found a number: 687
Found other: M1X

The regex used:

[^0-9]*\\s([0-9,]*)\\s([^0-9].*)


If all you need is a number section and a label section, you can simplify the regex as follows:

ABCD E (\d+(?:,\d+)*)(?: (\w+))?

Legend

ABCD E      # matches the literal string 'ABCD E ', note the spaces
(           # open capturing group 1
 \d+        # matches 1 or more digits
 (?:,\d+)*  # non-capturing group: a comma followed by 1 or more digits, repeated zero or more times
)           # close capturing group 1
(?:         # a non-capturing group
            # a literal space ' '
 (\w+)      # capturing group 2: 1 or more of [a-zA-Z0-9_]
)?          # close the non capturing group and make it optional
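The breakdown above can be tried out with a short Java sketch that turns group 1 into a list of integers and reads group 2 as an optional label. The class and method names are hypothetical, and `find()` is used on the assumption that a leading `//` may be present and should be ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberListDemo {
    // The simplified regex from this answer, with backslashes doubled for Java.
    static final Pattern LINE = Pattern.compile("ABCD E (\\d+(?:,\\d+)*)(?: (\\w+))?");

    // Returns the numbers of group 1 as integers; empty list if nothing matches.
    static List<Integer> numbers(String line) {
        List<Integer> result = new ArrayList<>();
        Matcher m = LINE.matcher(line);
        if (m.find()) {
            for (String part : m.group(1).split(",")) {
                result.add(Integer.parseInt(part));
            }
        }
        return result;
    }

    // Returns the label of group 2, or null when the line has no label.
    static String label(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "//ABCD E 1234,2345 L2";
        System.out.println(numbers(line) + " " + label(line));
    }
}
```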

To add some flexibility (allowing tabs as well as spaces between the parts, via the character class [ \t]), the regex becomes:

ABCD[ \t]+E[ \t]+(\d+(?:,\d+)*)(?:[ \t]+(\w+))?


Demo code

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LabelDemo {
  public static void main(String[] args) {
    String ln    = System.lineSeparator();
    String input = "ABCD E 1234 L1"      + ln
                 + "ABCD E 1234,2345 L2" + ln
                 + "ABCD E 4567"         + ln
                 + "ABCD E 2435,4679"    + ln
                 + "ABCD E 2435,4679,657 L6";

    final Pattern labelPattern = Pattern.compile("ABCD[ \\t]+E[ \\t]+(\\d+(?:,\\d+)*)(?:[ \\t]+(\\w+))?");
    Matcher m = labelPattern.matcher(input);

    int line = 1;
    while (m.find()) {
      System.out.print("Line " + line++ + ": ");

      if (m.group(2) == null) // group 2 is null when the line has no label
        System.out.println("'" + m.group(1) + "'"); // print number section
      else
        System.out.println("'" + m.group(1) + "', '" + m.group(2) + "'"); // print numbers and label section
    }
  }
}

Output

Line 1: '1234', 'L1'
Line 2: '1234,2345', 'L2'
Line 3: '4567'
Line 4: '2435,4679'
Line 5: '2435,4679,657', 'L6'

Source: https://habr.com/ru/post/1619142/

