RegEx: A word immediately before the last open parenthesis

I know a little about RegEx, but this problem is well beyond my abilities.

I need help finding the word immediately before the last open parenthesis that has no matching close parenthesis.

(This is for open source CallTip software under development.)

Below are some examples:

 --------------------------------
 Text                    I need
 --------------------------------
 aaa(                    aaa
 aaa(x)                  ''
 aaa(bbb(                bbb
 aaa(y=bbb(              bbb
 aaa(y=bbb()             aaa
 aaa(y <- bbb()          aaa
 aaa(bbb(x)              aaa
 aaa(bbb(ccc(            ccc
 aaa(bbb(x), ccc(        ccc
 aaa(bbb(x), ccc()       aaa
 aaa(bbb(x), ccc())      ''
 --------------------------------

Is it possible to write a RegEx (PCRE) for these situations?

The best I came up with was \([^\(]+$ , but it is no good: it does the opposite of what I need.

Can anybody help?

+2
4 answers

This works correctly with all of your string examples:

 \w+(?=\((?:[^()]*\([^()]*\))*[^()]*$) 

The most interesting part:

 (?:[^()]*\([^()]*\))* 

It matches zero or more balanced pairs of parentheses along with the non-paren characters before and between them (e.g. y=bbb() and bbb(x), ccc() in your example strings). Once that part has matched, the final [^()]*$ ensures that there are no more parentheses before the end of the line.
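As a quick check, the pattern can be run against the sample strings from the question. This is only a sketch in JavaScript, whose regex engine supports the lookahead used here; the helper name `wordBeforeOpenParen` and the `samples` table are made up for illustration.

```javascript
// The lookahead pattern from the answer, unchanged.
const pattern = /\w+(?=\((?:[^()]*\([^()]*\))*[^()]*$)/;

// Hypothetical helper: returns the matched word, or '' when nothing matches.
function wordBeforeOpenParen(text) {
  const m = text.match(pattern);
  return m ? m[0] : '';
}

// Sample strings from the question, paired with the result the asker wants.
const samples = [
  ['aaa(', 'aaa'],
  ['aaa(x)', ''],
  ['aaa(bbb(', 'bbb'],
  ['aaa(y=bbb(', 'bbb'],
  ['aaa(y=bbb()', 'aaa'],
  ['aaa(y <- bbb()', 'aaa'],
  ['aaa(bbb(x)', 'aaa'],
  ['aaa(bbb(ccc(', 'ccc'],
  ['aaa(bbb(x), ccc(', 'ccc'],
  ['aaa(bbb(x), ccc()', 'aaa'],
  ['aaa(bbb(x), ccc())', ''],
];

// Print each sample with the word found and whether it matches expectations.
for (const [text, expected] of samples) {
  const got = wordBeforeOpenParen(text);
  const verdict = got === expected ? 'ok' : 'MISMATCH';
  console.log(`${text.padEnd(20)} -> '${got}' ${verdict}`);
}
```

The regex engine tries start positions from left to right, so the first word whose lookahead succeeds is the one returned.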

Remember, however, that this regular expression is based on the assumption that there will never be more than one level of nesting. In other words, it assumes these are valid:

 aaa()
 aaa(bbb())
 aaa(bbb(), ccc())

... but this is not:

 aaa(bbb(ccc())) 

The string aaa(bbb(ccc( in your samples seems to imply that multi-level nesting is allowed. If so, you cannot solve your problem with just a plain regular expression. (Of course, some regex flavors support recursive patterns, but the syntax is hideous even by regex standards. I guarantee that you will not be able to read your own regular expression a week after writing it.)
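If multi-level nesting does matter, one alternative to a recursive pattern is a plain character scan that keeps a stack of the parentheses still open. The following is a minimal sketch of that idea, not something taken from this answer; `lastUnclosedWord` is a hypothetical name, and it assumes a "word" is a run of \w characters sitting directly before the parenthesis.

```javascript
// Hypothetical helper: scan the text once, keeping a stack that holds the
// word found immediately before each parenthesis that is still open.
function lastUnclosedWord(text) {
  const open = [];   // one entry per currently open '('
  let word = '';     // word characters seen directly before the cursor
  for (const ch of text) {
    if (/\w/.test(ch)) {
      word += ch;
    } else {
      if (ch === '(') open.push(word);  // remember the word before this '('
      else if (ch === ')') open.pop();  // closes the innermost open '('
      word = '';                        // any non-word character ends the word
    }
  }
  // The top of the stack belongs to the last '(' that was never closed.
  return open.length ? open[open.length - 1] : '';
}

console.log(lastUnclosedWord('aaa(bbb(x), ccc('));  // ccc
console.log(lastUnclosedWord('aaa(bbb(ccc())'));    // aaa
```

Because the stack depth is unbounded, this handles any level of nesting, at the cost of leaving the regex engine behind.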

+1


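Take a look at this JavaScript function:

 var recreg = function(x) {
   // matches an innermost balanced call such as bbb(x) or ccc()
   var r = /[a-zA-Z]+\([^()]*\)/;
   // keep deleting balanced calls until none are left
   while (x.match(r)) x = x.replace(r, '');
   return x;
 };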

After applying this, you are left with only the unmatched parts, those that have no closing parenthesis, and we just need the last alphabetic word.

 var lastpart = function(y) { return y.match(/([a-zA-Z]+)\([^(]*$/); };

The idea is to use it as

  lastpart(recreg('aaa(y <- bbb()')) 

Then check whether the result is null; otherwise take the captured group, which is result[1]. Many regex engines, JavaScript's included, do not support the (?R) construct that recursive regex matching requires, hence the loop.

Note that this is a JavaScript example that emulates a recursive regular expression. For background, read http://www.catonmat.net/blog/recursive-regular-expressions/
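Put together, the two steps might look like the sketch below. The wrapper `callTipWord` is a hypothetical addition that turns the null/result[1] check into a single call returning the word or an empty string.

```javascript
// Step 1: repeatedly delete innermost balanced calls such as "bbb(x)".
function recreg(x) {
  const r = /[a-zA-Z]+\([^()]*\)/;
  while (r.test(x)) x = x.replace(r, '');
  return x;
}

// Step 2: capture the alphabetic word before the last remaining '('.
function lastpart(y) {
  return y.match(/([a-zA-Z]+)\([^(]*$/);
}

// Hypothetical wrapper: returns '' when no unclosed call remains.
function callTipWord(text) {
  const result = lastpart(recreg(text));
  return result === null ? '' : result[1];
}

console.log(callTipWord('aaa(y <- bbb()'));      // aaa
console.log(callTipWord('aaa(bbb(x), ccc('));    // ccc
console.log(callTipWord('aaa(bbb(x), ccc())'));  // (empty string)
```

Note that step 1 deletes the function name together with its parentheses, so a fully closed outer call disappears entirely and step 2 finds nothing.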

+1

Partial solution - this assumes that your regular expression is called from a programming language that can loop.

1) Prune the input: find matching parentheses and delete them along with everything between them. Keep going until there is no match. The regular expression to look for is ([^()]*) - open parenthesis, non-parentheses, close parenthesis. It should be part of a "find and replace with nothing" loop. This prunes from the inside out.

2) After pruning, you either have no parentheses left, or only unmatched ones. Now you need to find the word before an open parenthesis. This requires a regular expression such as \w( . But that is not enough if there are several unclosed parentheses. Taking the last one can be done with a greedy match (with a group around the last \w ): ^.*\w( - "as many characters as you can, up to a word before a parenthesis" - this will find the last one.

I say "partial" solution because, depending on the environment you are using, how you refer to "this matching group" and whether you need a backslash before ( and ) will change. I left that detail out, as it is difficult to check on my iPhone.
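The two steps above can be sketched in JavaScript as follows. The function names `prune` and `lastOpenWord` are hypothetical, the parentheses are escaped as JavaScript requires, and the \b in step 2 is an adjustment rather than part of the answer: without it the greedy .* would leave only the final character of the word for the capture group.

```javascript
// Step 1: delete innermost "( ... )" pairs until none are left.
function prune(text) {
  const pair = /\([^()]*\)/;
  while (pair.test(text)) text = text.replace(pair, '');
  return text;
}

// Step 2: the greedy ".*" pushes the match to the last "word(" in the string;
// \b makes the capture group start at the beginning of that word.
function lastOpenWord(text) {
  const m = prune(text).match(/^.*\b(\w+)\(/);
  return m ? m[1] : '';
}

console.log(prune('aaa(bbb(x), ccc('));         // aaa(bbb, ccc(
console.log(lastOpenWord('aaa(bbb(x), ccc('));  // ccc
```

Unlike a version that also deletes the word before each balanced pair, pruning here leaves the names of closed calls behind (bbb above), which is harmless as long as they are not directly followed by another open parenthesis.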

I hope this inspires you or others to come up with a complete solution.

0

You didn't say which language / regex platform you are using, or whether recursive subpatterns are supported on it. However, the following two-step PHP code works for all the cases listed above:

 $str = 'aaa(bbb(x), ccc()'; // your original string

 // find all balanced parentheses (nested ones included) and replace them with blank
 $repl = preg_replace('/ ( \( (?: [^()]* | (?1) )* \) ) /x', '', $str);

 $matched = '';
 // find the word just before the last opening parenthesis in the replaced string
 if (preg_match('/\w+(?=[^\w(]*\([^(]*$)/', $repl, $arr))
     $matched = $arr[0];

 echo "*** Matched: [$matched]\n";

Live Demo: http://ideone.com/evXQYt

0

Source: https://habr.com/ru/post/1485806/

