Capture group with an optional substring

I am working with data of the following form (four examples, one per line):

some publication, issue no. 3
another publication, issue no. 23
yet another publication
here is another publication

I need to extract the name of the publication and, if it exists, the issue number. This must be done with a regular expression.

Therefore, given the above data, I am looking for the following results:

some publication            3
another publication         23
yet another publication     <null>
here is another publication <null>

The following pattern only works for data that contains the ", issue no. xyz" part:

    String underTest = "some publication, issue no. 3";

    String pattern = "(.*?), issue no. (\\d+)";
    Matcher matcher = Pattern.compile(pattern).matcher(underTest);

    boolean found = matcher.find();
    if (found) {
        log.info("something found");
        String group1 = matcher.group(1);
        log.info("group1: {}", group1);

        String group2 = matcher.group(2);
        log.info("group2: {}", group2);
    }

Any ideas for a regex that will work for both cases (with and without an issue number)?

1 answer

Use an optional non-capturing group around the optional part:

(.*?)(?:, issue no\. (\d+))?
     ^^^                  ^^ 

In Java:

String pattern = "(.*?)(?:, issue no\\. (\\d+))?";

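Note that this pattern has to be used with Matcher#matches(), not Matcher#find(). With find(), the lazy (.*?) is satisfied by an empty string at the start of the input and the optional group is simply skipped, so nothing useful is captured. matches() requires the whole input to match, which forces (.*?) to expand up to the optional part or to the end of the string.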
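
As a rough illustration, here is a minimal sketch of how the pattern might be applied to the four sample lines from the question. The class name IssueParser and the method parse are hypothetical names chosen for this example; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class IssueParser {

    // Publication name in group 1, optional issue number in group 2.
    private static final Pattern PATTERN =
            Pattern.compile("(.*?)(?:, issue no\\. (\\d+))?");

    // Returns {name, issueNumber}; issueNumber is null when the
    // ", issue no. N" part is absent.
    static String[] parse(String line) {
        Matcher matcher = PATTERN.matcher(line);
        // matches() anchors the pattern to the whole input, so the lazy
        // (.*?) cannot stop at an empty match as it would with find().
        if (!matcher.matches()) {
            // Reachable e.g. for input containing a line terminator,
            // because '.' does not match those by default.
            throw new IllegalArgumentException("Unexpected input: " + line);
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String[] samples = {
            "some publication, issue no. 3",
            "another publication, issue no. 23",
            "yet another publication",
            "here is another publication"
        };
        for (String sample : samples) {
            String[] parts = parse(sample);
            System.out.println(parts[0] + " -> " + parts[1]);
        }
    }
}
```

For a line without an issue number, matcher.group(2) returns null rather than an empty string, so callers should check for null before parsing the number.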


Source: https://habr.com/ru/post/1671706/
