Recursive Java regular expression replacement?

I can replace ABC(10,5) with (10)%(5) using:

 replaceAll("ABC\\(([^,]*)\\,([^,]*)\\)", "($1)%($2)") 

but I cannot figure out how to do this for ABC(ABC(20,2),5) or ABC(ABC(30,2),3+2).

If I can convert to ((20)%(2))%(5), how can I convert back to ABC(ABC(20,2),5)?

Thanks J

4 answers

I am going to answer the first question. I could not do it in a single replaceAll call, and I do not think that is possible with java.util.regex, since it has no recursive patterns. With a loop, however, this does the job:

    String termString = "([0-9+\\-*/()%]*)";
    String pattern = "ABC\\(" + termString + "\\," + termString + "\\)";
    String[] strings = {"ABC(10,5)", "ABC(ABC(20,2),5)", "ABC(ABC(30,2),3+2)"};
    for (String str : strings) {
        // Each pass rewrites the innermost remaining ABC(...) calls; stop once nothing changes.
        while (true) {
            String replaced = str.replaceAll(pattern, "($1)%($2)");
            if (replaced.equals(str)) {
                break;
            }
            str = replaced;
        }
        System.out.println(str);
    }

I assume that you are writing a parser for numeric expressions, hence the term definition termString = "([0-9+\\-*/()%]*)". It prints:

    (10)%(5)
    ((20)%(2))%(5)
    ((30)%(2))%(3+2)
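To see why a single pass is not enough, here is a small sketch that records the string after every pass. The class name PassTrace and the method passes are only illustrative names; the pattern is the same one as above.

```java
import java.util.ArrayList;
import java.util.List;

final class PassTrace {
    // Same term and pattern as in the answer above.
    static final String TERM = "([0-9+\\-*/()%]*)";
    static final String PATTERN = "ABC\\(" + TERM + "," + TERM + "\\)";

    // Returns the input followed by the result of every pass that changed something.
    static List<String> passes(String input) {
        List<String> steps = new ArrayList<>();
        steps.add(input);
        String current = input;
        while (true) {
            String next = current.replaceAll(PATTERN, "($1)%($2)");
            if (next.equals(current)) {
                return steps;
            }
            steps.add(next);
            current = next;
        }
    }

    public static void main(String[] args) {
        // Prints:
        // ABC(ABC(20,2),5)
        // ABC((20)%(2),5)
        // ((20)%(2))%(5)
        passes("ABC(ABC(20,2),5)").forEach(System.out::println);
    }
}
```

The first pass can only match the inner call, because the outer call still contains the letters of the nested ABC, which the term pattern does not allow. The outer call becomes matchable only after the inner one has been rewritten.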

EDIT. As per the OP's request, I am adding code to decode the strings. This is a bit more hacky than the forward direction:

    String[] encoded = {"(10)%(5)", "((20)%(2))%(5)", "((30)%(2))%(3+2)"};
    String decodeTerm = "([0-9+\\-*ABC\\[\\],]*)";
    String decodePattern = "\\(" + decodeTerm + "\\)%\\(" + decodeTerm + "\\)";
    for (String str : encoded) {
        // Square brackets stand in for the new parentheses so that later passes do not re-match them.
        while (true) {
            String replaced = str.replaceAll(decodePattern, "ABC[$1,$2]");
            if (replaced.equals(str)) {
                break;
            }
            str = replaced;
        }
        // Turn the temporary square brackets back into parentheses.
        str = str.replaceAll("\\[", "(");
        str = str.replaceAll("\\]", ")");
        System.out.println(str);
    }

And the result:

    ABC(10,5)
    ABC(ABC(20,2),5)
    ABC(ABC(30,2),3+2)
+1
source

You can evaluate the innermost reducible expressions first and repeat until none remain. However, you have to take care of the other commas and parentheses in the expression. @BorisStrandjev's solution is better and more bulletproof.

    String infix(String expr) {
        // Use placeholders for '(' and ')' so that the regex can use [^,()].
        for (;;) {
            // Mask innermost plain groups (a '(' not preceded by ABC), then rewrite innermost ABC calls.
            String expr2 = expr.replaceAll("(?<!ABC)\\(([^()]*)\\)", "<<$1>>")
                    .replaceAll("ABC\\(([^,()]*)\\,([^,()]*)\\)", "<<$1>>%<<$2>>");
            if (expr2.equals(expr)) {
                break;
            }
            expr = expr2;
        }
        // Turn the placeholders back into real parentheses.
        expr = expr.replaceAll("<<", "(");
        expr = expr.replaceAll(">>", ")");
        return expr;
    }
+1
source

You can try to rewrite the string in Polish (prefix) notation, and then replace % X Y with ABC(X,Y).

See the Wikipedia article on Polish notation.
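As a rough sketch of the second half of that idea, the code below turns a prefix expression back into the ABC(X,Y) form. It assumes the prefix string is already available with tokens separated by spaces, for example "% % 20 2 5" for ABC(ABC(20,2),5); the class and method names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class PrefixToAbc {
    // Reads one prefix expression from the front of the queue and returns it as ABC(X,Y) text.
    static String read(Deque<String> tokens) {
        String token = tokens.removeFirst();
        if (!token.equals("%")) {
            return token; // a plain operand such as 20 or 3+2
        }
        String x = read(tokens); // first operand, possibly another % expression
        String y = read(tokens); // second operand
        return "ABC(" + x + "," + y + ")";
    }

    // Assumption: tokens are separated by whitespace and % is the only operator token.
    static String toAbc(String prefix) {
        Deque<String> tokens = new ArrayDeque<>(Arrays.asList(prefix.trim().split("\\s+")));
        return read(tokens);
    }

    public static void main(String[] args) {
        System.out.println(toAbc("% % 20 2 5")); // prints ABC(ABC(20,2),5)
    }
}
```

Because every % takes exactly two operands, the order of the tokens alone fixes the nesting, so no parentheses are needed in the prefix form.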

The problem is that, when you replace recursively, you need to know which ABC(X,Y) rewrite happened first. Polish notation is useful for recovering the order in which these rewrites occurred, and it is widely used for evaluating expressions.

You can do this with a stack that records which rewrite happened first: find the innermost set of brackets, push only that expression onto the stack, and then remove it from your string. To restore the original expression, work through the stack from the top and apply the inverse transformation (X)%(Y) → ABC(X,Y).

This is a kind of Polish notation, the only difference being that you do not keep the entire expression as a string but store it on the stack, which simplifies processing.

In short, when replacing, start with the innermost terms (those that have no brackets in them) and apply the reverse substitution.

It may be useful to use (X)%(Y) → ABC{X,Y} as an intermediate rewriting rule, and then rewrite the curly braces as parentheses. That makes it easier to determine the innermost term, since the new terms do not use parentheses. It is also easier to implement, but not as elegant.
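One way to read the stack idea is sketched below. It is only an interpretation, not a definitive implementation: it assumes balanced parentheses and that each parenthesized group contains at most one (X)%(Y) pair, as in the strings produced by the forward conversion. Curly braces serve as the intermediate marker, and StackDecoder, rewrite and decode are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class StackDecoder {
    // Turns "{X}%{Y}" into "ABC(X,Y)"; any other text is returned unchanged.
    static String rewrite(String text) {
        int split = text.indexOf("}%{");
        if (split > 0 && text.startsWith("{") && text.endsWith("}")) {
            String x = text.substring(1, split);
            String y = text.substring(split + 3, text.length() - 1);
            return "ABC(" + x + "," + y + ")";
        }
        return text;
    }

    static String decode(String encoded) {
        // Each stack entry collects the text of one group that is still open.
        Deque<StringBuilder> stack = new ArrayDeque<>();
        stack.push(new StringBuilder());
        for (char c : encoded.toCharArray()) {
            if (c == '(') {
                stack.push(new StringBuilder()); // a new, deeper group starts
            } else if (c == ')') {
                // The innermost open group is complete: rewrite it and hand it
                // to its parent, marked with curly braces instead of parentheses.
                String group = rewrite(stack.pop().toString());
                stack.peek().append('{').append(group).append('}');
            } else {
                stack.peek().append(c);
            }
        }
        String result = rewrite(stack.pop().toString());
        // Any braces left over belong to plain groups; restore them as parentheses.
        return result.replace('{', '(').replace('}', ')');
    }

    public static void main(String[] args) {
        System.out.println(decode("((20)%(2))%(5)"));   // prints ABC(ABC(20,2),5)
        System.out.println(decode("((30)%(2))%(3+2)")); // prints ABC(ABC(30,2),3+2)
    }
}
```

Since a group is rewritten the moment its closing parenthesis is read, inner terms are always converted before the terms that contain them, which is the innermost-first order described above.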

0
source

You can use this regex library, https://github.com/florianingerl/com.florianingerl.util.regex, which also supports recursive regular expressions.

Converting ABC(ABC(20,2),5) to ((20)%(2))%(5) looks like this:

    // Pattern, Matcher, DefaultCaptureReplacer and CaptureTreeNode come from the library above, not from java.util.regex.
    Pattern pattern = Pattern.compile(
            "(?<abc>ABC\\((?<arg1>(?:(?'abc')|[^,])+)\\,(?<arg2>(?:(?'abc')|[^)])+)\\))");
    Matcher matcher = pattern.matcher("ABC(ABC(20,2),5)");
    String replacement = matcher.replaceAll(new DefaultCaptureReplacer() {
        @Override
        public String replace(CaptureTreeNode node) {
            if ("abc".equals(node.getGroupName())) {
                return "(" + replace(node.getChildren().get(0)) + ")%("
                        + replace(node.getChildren().get(1)) + ")";
            } else {
                return super.replace(node);
            }
        }
    });
    System.out.println(replacement);
    assertEquals("((20)%(2))%(5)", replacement);

Converting back, i.e. from ((20)%(2))%(5) to ABC(ABC(20,2),5), looks like this:

    // Same library classes as in the previous snippet.
    Pattern pattern = Pattern.compile(
            "(?<fraction>(?<arg>\\(((?:(?'fraction')|[^)])+)\\))%(?'arg'))");
    Matcher matcher = pattern.matcher("((20)%(2))%(5)");
    String replacement = matcher.replaceAll(new DefaultCaptureReplacer() {
        @Override
        public String replace(CaptureTreeNode node) {
            if ("fraction".equals(node.getGroupName())) {
                return "ABC(" + replace(node.getChildren().get(0)) + ","
                        + replace(node.getChildren().get(1)) + ")";
            } else if ("arg".equals(node.getGroupName())) {
                return replace(node.getChildren().get(0));
            } else {
                return super.replace(node);
            }
        }
    });
    System.out.println(replacement);
    assertEquals("ABC(ABC(20,2),5)", replacement);
0
source

Source: https://habr.com/ru/post/1401759/

