To capture the entire IF / ENDIF block using balanced IF statements, you can use this regex:
%IF\s+(?<Name>\w+) (?<Contents> (?>
The point is this: a single Match gives you only one value for each named group. You will get only one (?<Name>\w+), for example: the last captured value. In my regex, I kept the Name and Contents groups of your simple regex and confined the balancing to the Contents group; the regex is still wrapped in %IF and %ENDIF.
It gets interesting when your data is more complex. For instance:
%IF MY_VAR some text %IF OTHER_VAR some other text %ENDIF %IF OTHER_VAR2 some other text 2 %ENDIF %ENDIF %IF OTHER_VAR3 some other text 3 %ENDIF
You will get two matches here: one for MY_VAR and one for OTHER_VAR3. If you want to capture the two nested ifs inside MY_VAR, you need to re-run the regex on its Contents group. (You can work around that with a lookahead if you must: wrap the whole regex in (?=...). But then you need some way to rebuild the logical structure yourself, using positions and lengths.)
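To make the "re-run the regex on Contents" idea concrete, here is a minimal C# sketch. The pattern in it is my own assumption of how a balanced %IF / %ENDIF regex can be written, not necessarily the exact regex from this answer, and the names IfBlocks, Block, Dump and the Open group are made up for illustration. It prints every block with its absolute position and length, indented by nesting depth.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IfBlocks
{
    // Assumed pattern (illustrative only): Name and Contents as in the question,
    // with a hypothetical "Open" stack that balances nested %IF / %ENDIF pairs.
    public static readonly Regex Block = new Regex(@"
        %IF\s+(?<Name>\w+)
        (?<Contents>
            (?>                           # atomic: never backtrack into one iteration
                (?<Open>%IF\b)            # nested %IF: push
              | (?<-Open>%ENDIF\b)        # nested %ENDIF: pop, fails when nothing is open
              | (?!%IF\b|%ENDIF\b) .      # any other character
            )*
            (?(Open)(?!))                 # fail if a nested %IF was never closed
        )
        %ENDIF\b",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

    // Prints each block found in text, then recurses into its Contents.
    // offset turns positions inside a Contents substring back into
    // positions in the original input.
    public static void Dump(string text, int offset, int depth)
    {
        foreach (Match m in Block.Matches(text))
        {
            Group contents = m.Groups["Contents"];
            Console.WriteLine("{0}{1} @ {2}, length {3}",
                new string(' ', depth * 2),
                m.Groups["Name"].Value,
                offset + m.Index,
                m.Length);
            Dump(contents.Value, offset + contents.Index, depth + 1);
        }
    }

    public static void Main()
    {
        string input = "%IF MY_VAR some text %IF OTHER_VAR some other text %ENDIF " +
                       "%IF OTHER_VAR2 some other text 2 %ENDIF %ENDIF " +
                       "%IF OTHER_VAR3 some other text 3 %ENDIF";
        Dump(input, 0, 0);
    }
}
```

On the sample above this should list MY_VAR and OTHER_VAR3 at the top level, with OTHER_VAR and OTHER_VAR2 indented under MY_VAR. Recursing on the Contents string keeps each pass simple; the offset parameter is what carries the "positions and lengths" bookkeeping.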
I will not explain too much, because it seems you have the basics, but a short note about the Contents group: I use a possessive (atomic) group to avoid backtracking. Otherwise, the dot could eventually match a whole IF and break the balance. A lazy match on the group would behave similarly ( ( )+? instead of (?> )+ ).