The problem is that your string contains newlines. By default, `.` in a pattern does not match newline characters, so the pattern can only match within a single line. You need to add the /s modifier so that the match can span multiple lines.
Here is my solution; this is the way I prefer to do it.
<?php
$html = <<<EOD
<html>
<head>
</head>
<body buu="grger" ga="Gag">
<p>Some text</p>
</body>
</html>
EOD;

// get anything between <body> and </body> where <body can="have_as many" attributes="as required">
if (preg_match('/(?:<body[^>]*>)(.*)<\/body>/isU', $html, $matches)) {
    $body = $matches[1];
}

// outputting all matches for debugging purposes
var_dump($matches);
?>
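To see what the /s modifier changes in isolation, here is a minimal sketch that runs the same kind of pattern with and without it. The sample string and the variable names ($without, $with, $m1, $m2) are my own and not taken from your code.

```php
<?php
// Hypothetical sample input with real newlines inside <body>.
$html = "<html>\n<body class=\"x\">\n<p>Some text</p>\n</body>\n</html>";

// Without /s, "." does not match "\n", so the match cannot span lines.
$without = preg_match('/<body[^>]*>(.*)<\/body>/iU', $html, $m1);

// With /s, "." also matches "\n", so everything inside <body> is captured.
$with = preg_match('/<body[^>]*>(.*)<\/body>/isU', $html, $m2);

$body = $with ? $m2[1] : null;

var_dump($without); // int(0): no match without /s
var_dump($with);    // int(1): match with /s
var_dump($body);    // the captured text, including the surrounding newlines
```

The captured group keeps the newlines right after <body> and right before </body>, so you may want to trim() the result.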
Edit: I am updating my answer to give you a more complete explanation of why your code is not working.
You have this string:
<html>
<head>
</head>
<body>
<p>Some text</p>
</body>
</html>
Everything seems to be in order, but every line ends with non-printable line-break characters. The string has 55 printable characters and 7 line breaks, and each line break takes up 2 characters here (the offsets below are consistent with \r\n line endings).
When you reach this part of the code:
$index_of_body_end_tag = strpos($html, '</body>');
You get the correct position of </body> (it starts at offset 51), but that offset also counts the line-break characters.
So, when you reach this line of code:
$index_of_body_start_tag + strlen($matched_body_start_tag)
It evaluates to 25 + 6 = 31 (line breaks included), and:
$index_of_body_end_tag - $index_of_body_start_tag + strlen($matched_body_start_tag)
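It evaluates to 51 - 25 + 6 = 32 (the number of characters you are asking to read). However, between <body> and </body> there are only 16 printable characters plus 4 non-printable ones (the line break after <body> and the line break before </body>), so 20 in total. This is the actual problem: subtraction and addition are evaluated left to right, so the length of the start tag gets added instead of subtracted. You need to group the calculation with parentheses, like this: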
$index_of_body_end_tag - ($index_of_body_start_tag + strlen($matched_body_start_tag))
This evaluates to 51 - (25 + 6) = 51 - 31 = 20 (16 + 4).
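The arithmetic above can be checked with a short sketch. It assumes the string uses \r\n line endings (which is what reproduces offsets 25 and 51) and that the offsets are passed to substr(); the names $start, $wrong_length and $right_length are mine, the other variable names come from your code.

```php
<?php
// Assumption: \r\n line endings, which reproduce the offsets quoted above.
$html = "<html>\r\n<head>\r\n</head>\r\n<body>\r\n<p>Some text</p>\r\n</body>\r\n</html>\r\n";

$matched_body_start_tag  = '<body>';
$index_of_body_start_tag = strpos($html, $matched_body_start_tag); // 25
$index_of_body_end_tag   = strpos($html, '</body>');               // 51

// First character after the opening tag: 25 + 6 = 31.
$start = $index_of_body_start_tag + strlen($matched_body_start_tag);

// Without parentheses: (51 - 25) + 6 = 32, which reads past </body>.
$wrong_length = $index_of_body_end_tag - $index_of_body_start_tag + strlen($matched_body_start_tag);

// With parentheses: 51 - (25 + 6) = 20.
$right_length = $index_of_body_end_tag - ($index_of_body_start_tag + strlen($matched_body_start_tag));

$body = substr($html, $start, $right_length);

var_dump($wrong_length); // int(32)
var_dump($right_length); // int(20)
var_dump($body);         // 20 bytes: "\r\n<p>Some text</p>\r\n"
```

Since $start already holds the grouped sum, writing the length as $index_of_body_end_tag - $start avoids the precedence trap altogether.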
:) Hope this helps you understand why the order of operations matters. (Sorry for the misleading remark about newlines; it applies only to the regex example I gave above.)