How can I split a string into groups?

I am trying to figure out how to split a string into groups. I do not think that the split(regex) method will be sufficient for it.

I have String complexStatement = "(this && that)||(these&&those)||(me&&you)"; , and I would like the array to look like this:

 "(this && that)","(these&&those)","(me&&you)"

If I had "(5+3)*(2+5)+(9)", I would like to get "(5+3)", "(2+5)", "(9)".
(Bonus points if you can somehow also keep the connecting operators, for example *, +, ||.)

Is this possible for arbitrary input strings? I have been playing with StringTokenizer, but I have not gotten it to work yet.

+4
3 answers

You can use the following code:

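    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "(this && that)\",\"(these&&those)\",\"(me&&you)";
    // Match "(", then one or more characters that are not ")", then ")".
    Pattern pattern = Pattern.compile("\\(([^\\)]+)\\)");
    Matcher m = pattern.matcher(str);
    while (m.find()) {
        System.out.println(m.group(0)); // the whole match, parentheses included
    }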

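The pattern \\(([^\\)]+)\\) matches whatever sits inside a pair of parentheses. m.group(0) is the whole match including the parentheses; m.group(1) is just the content between them. Note that this only works for parentheses that are not nested.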

Edit:

To capture the content between a ) and the following ( (that is, the connecting operator), just replace the regular expression with \\)([^\\(]+)\\( and read m.group(1).
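Both pieces can also be collected in a single pass. The following is only a sketch of that idea, not the code from the answer above: it assumes the parentheses are never nested, and the class name RegexGroups and the method split are made up for illustration. It uses the match positions (Matcher.start() and Matcher.end()) to pick up the text between two consecutive groups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class RegexGroups {
    // Matches one non-nested parenthesised group, e.g. "(this && that)".
    private static final Pattern GROUP = Pattern.compile("\\([^()]*\\)");

    /** Returns the groups in order and fills 'connectors' with the text between them. */
    static List<String> split(String input, List<String> connectors) {
        List<String> groups = new ArrayList<>();
        Matcher m = GROUP.matcher(input);
        int previousEnd = -1; // end index of the previous group, -1 before the first one
        while (m.find()) {
            if (previousEnd >= 0) {
                // Everything between two groups is treated as the connector.
                connectors.add(input.substring(previousEnd, m.start()).trim());
            }
            groups.add(m.group());
            previousEnd = m.end();
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> connectors = new ArrayList<>();
        List<String> groups = split("(this && that)||(these&&those)||(me&&you)", connectors);
        System.out.println(groups);     // [(this && that), (these&&those), (me&&you)]
        System.out.println(connectors); // [||, ||]
    }
}
```

Text before the first group or after the last one is ignored here; if the input can start or end with an operand outside parentheses, that case would need extra handling.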

+4

I think you would be better off implementing the parsing yourself rather than depending on the built-in methods.

Here is my suggestion. I assume the input format will always look like the following:

 (value1+operator+value2)+operator+(value3+operator+value4)+........ 

(Here the operators may differ; the + just shows concatenation.)

If the above assumption is correct, you can do the following.

  • Use a stack.
  • While reading the source string, push every character onto the stack.
  • Then pop the characters off the stack one at a time, using the following logic:
      a. On ")", start appending characters to a string.
      b. On "(", append it to the string; you now have one complete token (in reverse order, since the stack returns the characters last to first). Add the token to the array.
      c. After a "(", skip characters until the next ")".

NB: this is simple pseudo-code, just a rough idea.
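The steps above could be sketched in Java roughly like this. It is only one possible reading of the pseudo-code: it assumes the parentheses are never nested, and the class name StackSplitter and the method split are invented for the example. Because the stack hands the characters back last to first, each token is reversed before it is stored, and tokens are inserted at the front of the list to restore the original order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper illustrating the stack idea; assumes no nested parentheses.
class StackSplitter {
    static List<String> split(String input) {
        // Push every character of the input onto the stack.
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : input.toCharArray()) {
            stack.push(c);
        }

        List<String> tokens = new ArrayList<>();
        StringBuilder token = new StringBuilder();
        boolean inGroup = false;

        // Pop the characters back off; they arrive in reverse order.
        while (!stack.isEmpty()) {
            char c = stack.pop();
            if (c == ')') {
                inGroup = true;          // a group starts (seen from the end)
            }
            if (inGroup) {
                token.append(c);
            }
            if (c == '(' && inGroup) {
                // Group complete: undo the reversal and keep the original order.
                tokens.add(0, token.reverse().toString());
                token.setLength(0);
                inGroup = false;         // skip everything up to the next ')'
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("(5+3)*(2+5)+(9)")); // [(5+3), (2+5), (9)]
    }
}
```

Pushing everything and popping it again is not strictly necessary; a single left-to-right scan gives the same tokens without the reversal, but the sketch follows the steps as described.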

+2

If you want to capture only the groups delimited by parentheses at the outermost level, you are beyond what regular expressions can do and you will need to parse the input. StinePike's approach is good; another one (in Python-like pseudo-code) is as follows:

    insides = []
    outsides = []
    nesting_level = 0
    string = ""
    while not done_reading_input():
        char = get_next_char()
        if char == '(':
            if nesting_level == 0:
                # entering an outermost group: store the text before it
                outsides.add(string)
                string = ""
            else:
                string += char
            nesting_level += 1
        elif char == ')':
            nesting_level -= 1
            if nesting_level == 0:
                # leaving an outermost group: store its contents
                insides.add(string)
                string = ""
            else:
                string += char
        else:
            string += char

If the very first character of your input is '(', you will get an extra empty string in the outsides array, but that is easy to fix.
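For reference, here is a sketch of how the same nesting-level idea might look in Java. The class name OuterGroups and the method parse are made up for the example. It deviates from the pseudo-code in a few ways that are my own choices: each group is stored without its enclosing parentheses, empty outside strings are skipped (which sidesteps the extra entry mentioned above), text after the last group is kept, and unbalanced input is rejected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits on the outermost parentheses only.
class OuterGroups {
    final List<String> insides = new ArrayList<>();   // contents of outermost groups
    final List<String> outsides = new ArrayList<>();  // text between the groups

    static OuterGroups parse(String input) {
        OuterGroups result = new OuterGroups();
        StringBuilder current = new StringBuilder();
        int nestingLevel = 0;

        for (char c : input.toCharArray()) {
            if (c == '(') {
                if (nestingLevel == 0) {
                    // Entering an outermost group: flush the text before it.
                    if (current.length() > 0) {
                        result.outsides.add(current.toString());
                    }
                    current.setLength(0);
                } else {
                    current.append(c); // a nested '(' belongs to the group
                }
                nestingLevel++;
            } else if (c == ')') {
                if (nestingLevel == 0) {
                    throw new IllegalArgumentException("Unmatched ')'");
                }
                nestingLevel--;
                if (nestingLevel == 0) {
                    // Leaving an outermost group: store its contents.
                    result.insides.add(current.toString());
                    current.setLength(0);
                } else {
                    current.append(c); // a nested ')' belongs to the group
                }
            } else {
                current.append(c);
            }
        }
        if (nestingLevel != 0) {
            throw new IllegalArgumentException("Unmatched '('");
        }
        if (current.length() > 0) {
            result.outsides.add(current.toString()); // text after the last group
        }
        return result;
    }

    public static void main(String[] args) {
        OuterGroups r = parse("((1+2)*3)+(4)");
        System.out.println(r.insides);  // [(1+2)*3, 4]
        System.out.println(r.outsides); // [+]
    }
}
```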

If you are also interested in the nested parentheses, two arrays will not be enough as output; you will need a tree.
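To give an idea of what such a tree could look like, here is a minimal sketch. Everything in it (the class ParenTree, the Node type, the parse method) is hypothetical and only one of many possible designs: a node is either a piece of plain text or a parenthesised group whose children are again text pieces and groups. A stack of currently open groups tracks where new children go.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical tree builder for nested parentheses.
class ParenTree {
    /** A node is either a text leaf (text != null) or a group with children. */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) {
            this.text = text;
        }

        @Override
        public String toString() {
            if (text != null) {
                return text;
            }
            StringBuilder sb = new StringBuilder("(");
            for (Node child : children) {
                sb.append(child);
            }
            return sb.append(')').toString();
        }
    }

    /** Returns a root group whose children are the top-level pieces of the input. */
    static Node parse(String input) {
        Node root = new Node(null);
        Deque<Node> open = new ArrayDeque<>(); // groups that are not closed yet
        open.push(root);
        StringBuilder text = new StringBuilder();

        for (char c : input.toCharArray()) {
            if (c == '(' || c == ')') {
                // A parenthesis ends the current run of plain text.
                if (text.length() > 0) {
                    open.peek().children.add(new Node(text.toString()));
                    text.setLength(0);
                }
                if (c == '(') {
                    Node group = new Node(null);
                    open.peek().children.add(group);
                    open.push(group);
                } else {
                    if (open.size() == 1) {
                        throw new IllegalArgumentException("Unmatched ')'");
                    }
                    open.pop();
                }
            } else {
                text.append(c);
            }
        }
        if (open.size() != 1) {
            throw new IllegalArgumentException("Unmatched '('");
        }
        if (text.length() > 0) {
            root.children.add(new Node(text.toString()));
        }
        return root;
    }

    public static void main(String[] args) {
        Node root = parse("((1+2)*3)+(4)");
        for (Node child : root.children) {
            System.out.println(child); // prints ((1+2)*3), then +, then (4)
        }
    }
}
```

The top-level children of the root correspond to the groups and connectors from the flat approaches; walking into a group node gives access to its nested groups.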

+1

Source: https://habr.com/ru/post/1488883/

