Regex: How to do this? (nested group within repeating group)

How can I solve this Java regex problem?

Input:

some heading text... ["fds afsa","fwr23423","42df f","1a_4( 211@ #","3240acg!g"] some trailing text....

Problem: I would like to grab everything between double quotes. (Example: fds afsa, fwr23423, etc.)

I tried the following pattern:

\[(?:"([^"]+)",?)+\]

But calling Matcher.find() with this pattern throws a StackOverflowError on larger input (it works for small input; this is a known limitation of the Java regex engine). And even when it works, matcher.group(1) only returns the last captured string, "3240acg!g".

How can I solve this problem? (Or should I use multiple patterns, where the first one strips off the brackets?)

2 answers

Get the string between [ ] and then split it on commas. It is much simpler.
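A minimal sketch of that idea, assuming that no item contains a comma or a bracket (a comma inside the quotes would split the item in two). The class name BracketSplit and the helper extract are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class BracketSplit {
    // Hypothetical helper: takes the text between the first '[' and the
    // next ']', splits it on commas and strips the surrounding quotes.
    // Assumes no item contains a comma or a bracket.
    static List<String> extract(String input) {
        List<String> items = new ArrayList<>();
        int open = input.indexOf('[');
        int close = input.indexOf(']', open + 1);
        if (open < 0 || close < 0) {
            return items; // no bracketed list found
        }
        String inner = input.substring(open + 1, close);
        for (String part : inner.split(",")) {
            String trimmed = part.trim();
            // Keep only parts that are actually wrapped in double quotes.
            if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
                items.add(trimmed.substring(1, trimmed.length() - 1));
            }
        }
        return items;
    }

    public static void main(String[] args) {
        String input = "some heading text... [\"fds afsa\",\"fwr23423\",\"42df f\","
                + "\"1a_4( 211@ #\",\"3240acg!g\"] some trailing text....";
        for (String item : extract(input)) {
            System.out.println(item);
        }
    }
}
```

No regex is involved here, so the StackOverflowError from the question cannot occur on this path.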


Three suggestions:

If quoted strings occur only between the brackets, then you do not need to check for the brackets at all: just use "[^"]*" as your regular expression and find all matches (provided there are no escaped quotes).

If this does not work because quoted strings may also occur in other places where you do not want to capture them, do it in two stages.
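As a rough sketch of this first suggestion: the pattern below adds one capture group around the text inside the quotes (an addition for convenience, not part of the suggested regex) and calls find() in a loop instead of repeating a group inside the pattern. The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStrings {
    // "[^"]*" with a capture group around the content between the quotes.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the content of every double-quoted string in the input.
    // Assumes there are no escaped quotes inside the strings.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            result.add(m.group(1)); // each find() yields one quoted string
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "some heading text... [\"fds afsa\",\"fwr23423\",\"42df f\","
                + "\"1a_4( 211@ #\",\"3240acg!g\"] some trailing text....";
        for (String s : extract(input)) {
            System.out.println(s);
        }
    }
}
```

Because every match is collected on its own find() call, all items are returned, not just the last one as with group(1) on a repeated group.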

  • Match \[[^\]]*\] .
  • Find all occurrences of "[^"]*" in the result of the first match. Or even use a JSON parser to read this string.
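The two stages could look roughly like this. It is only a sketch: it handles the first bracketed list in the input, assumes there are no escaped quotes and no brackets inside the strings, and the capture groups as well as the names TwoStage and extract are additions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStage {
    // Stage 1: the bracketed list, with its content in group 1.
    private static final Pattern BRACKETED = Pattern.compile("\\[([^\\]]*)\\]");
    // Stage 2: a single quoted string, with its content in group 1.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher outer = BRACKETED.matcher(input);
        if (outer.find()) { // only the first [...] is considered
            Matcher inner = QUOTED.matcher(outer.group(1));
            while (inner.find()) {
                result.add(inner.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Quoted strings outside the brackets are ignored.
        String input = "a \"stray\" quote [\"fds afsa\",\"fwr23423\"] and \"another\"";
        for (String s : extract(input)) {
            System.out.println(s);
        }
    }
}
```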

The third possibility, a little cheating:

Search for "[^"\[\]]*"(?=[^\[\]]*\]) . This matches a quoted string only if the next bracket following it is a closing bracket. Limitation: no brackets are allowed inside the strings. I find this ugly, especially when you see how it looks in Java:

    List<String> matchList = new ArrayList<String>();
    Pattern regex = Pattern.compile("\"[^\"\\[\\]]*\"(?=[^\\[\\]]*\\])");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        matchList.add(regexMatcher.group());
    }

Do you think anyone looking at this a few months from now will be able to tell what it does?


Source: https://habr.com/ru/post/901212/

