It all depends on where your data comes from: external, uncontrolled sources can deliver pretty dirty data. Here is a hint for those of you trying to solve (or at least work around) the problem of matching correctly at the end ($) of every line in multi-line mode (/m).
<?php
// Various OSes use different end-of-line (aka line break) characters:
// - Windows uses CR+LF (\r\n);
// - Linux uses LF (\n);
// - classic Mac OS (before OS X) used CR (\r).
// That is why the single dollar assertion ($) sometimes fails in multiline (/m) mode:
// by default only LF counts as a line ending - a possible bug in PHP 5.3.8 or just a "feature"(?).
$str="ABC ABC\n\n123 123\r\ndef def\rnop nop\r\n890 890\nQRS QRS\r\r~-_ ~-_";
//          C          3                   p          0                   _
$pat1='/\w$/mi';            // This works excellently in JavaScript (Firefox 7.0.1+)
$pat2='/\w\r?$/mi';         // Slightly better
$pat3='/\w\R?$/mi';         // Somewhat disappointing, given php.net and pcre.org, when used improperly
$pat4='/\w(?=\R)/i';        // Much better: a lookahead assertion (detects without capturing) and no multiline (/m) mode;
                            // note that with an alternative for the end of the string ((?=\R|$)) it would grab all 7 elements as expected
$pat5='/\w\v?$/mi';
$pat6='/(*ANYCRLF)\w$/mi';  // Excellent, but undocumented on php.net at the moment (described on pcre.org and en.wikipedia.org)

$n=preg_match_all($pat1, $str, $m1);
$o=preg_match_all($pat2, $str, $m2);
$p=preg_match_all($pat3, $str, $m3);
$r=preg_match_all($pat4, $str, $m4);
$s=preg_match_all($pat5, $str, $m5);
$t=preg_match_all($pat6, $str, $m6);

echo $str."\n1 !!! $pat1 ($n): ".print_r($m1[0], true)
    ."\n2 !!! $pat2 ($o): ".print_r($m2[0], true)
    ."\n3 !!! $pat3 ($p): ".print_r($m3[0], true)
    ."\n4 !!! $pat4 ($r): ".print_r($m4[0], true)
    ."\n5 !!! $pat5 ($s): ".print_r($m5[0], true)
    ."\n6 !!! $pat6 ($t): ".print_r($m6[0], true);

// Note the difference among the three very helpful escape sequences in $pat2 (\r),
// $pat3 and $pat4 (\R), $pat5 (\v) and the altered newline option in $pat6 ((*ANYCRLF)) -
// for some applications at least.

/* The code above results in the following output:

ABC ABC

123 123
def def
nop nop
890 890
QRS QRS

~-_ ~-_
1 !!! /\w$/mi (3): Array
(
    [0] => C
    [1] => 0
    [2] => _
)

2 !!! /\w\r?$/mi (5): Array
(
    [0] => C
    [1] => 3
    [2] => p
    [3] => 0
    [4] => _
)

3 !!! /\w\R?$/mi (5): Array
(
    [0] => C
    [1] => 3
    [2] => p
    [3] => 0
    [4] => _
)

4 !!! /\w(?=\R)/i (6): Array
(
    [0] => C
    [1] => 3
    [2] => f
    [3] => p
    [4] => 0
    [5] => S
)

5 !!! /\w\v?$/mi (5): Array
(
    [0] => C
    [1] => 3
    [2] => p
    [3] => 0
    [4] => _
)

6 !!! /(*ANYCRLF)\w$/mi (7): Array
(
    [0] => C
    [1] => 3
    [2] => f
    [3] => p
    [4] => 0
    [5] => S
    [6] => _
)
*/
?>
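The comment on $pat4 mentions that adding an end-of-string alternative to the lookahead should pick up all 7 line endings. Here is a minimal sketch of that variant, reusing the same sample string; the names $pat7, $m7 and $count are only illustrative and not part of the code above.

```php
<?php
// Same sample string as above: mixed LF, CR+LF and bare CR line endings.
$str = "ABC ABC\n\n123 123\r\ndef def\rnop nop\r\n890 890\nQRS QRS\r\r~-_ ~-_";

// Lookahead for any line break (\R) or for the end of the string ($).
// There is no /m modifier here, so $ only matches at the very end of the subject.
$pat7 = '/\w(?=\R|$)/';

// $count holds the number of matches, $m7[0] the matched characters.
$count = preg_match_all($pat7, $str, $m7);

echo "$pat7 ($count): " . implode(' ', $m7[0]) . "\n";
// Prints: /\w(?=\R|$)/ (7): C 3 f p 0 S _
```

Because the lookahead does not consume the line break, this form does not depend on the newline convention PCRE was compiled with, which may make it a reasonable fallback where (*ANYCRLF) is not available.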
Unfortunately, I do not have access to a server with the latest version of PHP: my local PHP is 5.3.8, and my public PHP host runs version 5.2.17.
Wirek