I am trying to write a recursive regular expression to capture blocks of code, but for some reason it does not capture them properly. I would expect the code below to capture the entire body of the function, but instead it captures only the body of the first if statement.
It is almost as if the .+? somehow consumes the first {, but it is supposed to be non-greedy, so I don't understand why it would.
What causes it to behave this way?
Script:
use strict;
use warnings;
my $text = << "END";
int max(int x, int y)
{
if (x > y)
{
return x;
}
else
{
return y;
}
}
END
my $regex = qr/
    \{
    (?:
        [^{}]++
        |
        (?R)
    )*
    \}
/x;
if ($text =~ m/int\s.+?($regex)/s) {
    print $1;
}
Output:
{
return x;
}
Expected Result:
{
if (x > y)
{
return x;
}
else
{
return y;
}
}
I know that Text::Balanced exists for this purpose, but I am trying to do this by hand to learn more about regular expressions.
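For comparison, here is a minimal sketch of a variant that may be worth testing against the script above. It rests on an assumption that should be checked against perlre: (?R) recurses into the whole pattern, which after interpolation would include the int\s.+? prefix, whereas (?1) re-enters only the first capture group. The names $pattern and $block are made up for this illustration.

```perl
use strict;
use warnings;

my $text = <<'END';
int max(int x, int y)
{
if (x > y)
{
return x;
}
else
{
return y;
}
}
END

# Assumption: recursion should re-enter only the brace group,
# so the group is captured and (?1) is used in place of (?R).
my $pattern = qr/
    int\s.+?
    (                   # group 1: one balanced {...} block
        \{
        (?:
            [^{}]++     # a run of non-brace characters
          |
            (?1)        # recurse into group 1 only
        )*
        \}
    )
/xs;

# Keep the captured block, or undef if nothing matched.
my $block = $text =~ $pattern ? $1 : undef;
print $block if defined $block;
```

If the assumption holds, the whole pattern has to live in one regex so that the group number is stable; a named group with (?&name) might be a more robust choice when the brace pattern is interpolated into other patterns.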