Simply splitting on spaces will not work correctly if an unexpected character appears where a space should be, or if the sentence contains several consecutive spaces.
The following version works no matter what kind of "space" you use between words, and it can easily be extended to handle other characters. It currently supports any whitespace character plus , . ; ? !
function get_snippet( $str, $wordCount = 10 ) {
    return implode(
        '',
        array_slice(
            preg_split(
                '/([\s,\.;\?\!]+)/',
                $str,
                $wordCount*2+1,
                PREG_SPLIT_DELIM_CAPTURE
            ),
            0,
            $wordCount*2-1
        )
    );
}
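As a quick check of the claims above, here is a small usage sketch. The function is repeated so the snippet runs on its own; the input strings are made-up samples, not from the original question.

```php
<?php
// Same function as above, repeated so this snippet is self-contained.
function get_snippet($str, $wordCount = 10) {
    return implode(
        '',
        array_slice(
            preg_split('/([\s,\.;\?\!]+)/', $str, $wordCount * 2 + 1, PREG_SPLIT_DELIM_CAPTURE),
            0,
            $wordCount * 2 - 1
        )
    );
}

// Hypothetical sample inputs: a comma followed by a double space, and tab/newline separators.
$punctuated = get_snippet("The quick,  brown fox; jumps over", 3);
$tabbed     = get_snippet("one\ttwo\nthree four", 2);

echo $punctuated, "\n"; // The quick,  brown
echo $tabbed, "\n";     // one<TAB>two
```

The separators come back exactly as they were in the input, because PREG_SPLIT_DELIM_CAPTURE keeps them in the result array. One thing to watch: if the string starts with a separator, the first piece is an empty string and still counts as a word, so you may want to trim() the input first.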
Regular expressions are ideal for this problem, because you can easily make the code as flexible or as strict as you like. You do have to be careful, however. I deliberately wrote the expression above to target the gaps between words rather than the words themselves, because it is quite difficult to state unambiguously what defines a word.
Take the \w word-character class, or its inverse \W. I rarely rely on them, mainly because, depending on the software you use (for example, certain versions of PHP), they do not always include UTF-8 or Unicode characters.
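To illustrate the point about \w, here is a minimal sketch. The sample text is made up, and the file is assumed to be saved as UTF-8. The result without the u modifier can vary with PHP version and locale, so no exact output is claimed for it.

```php
<?php
// Hypothetical sample text with non-ASCII letters (source file assumed to be UTF-8).
$text = 'naïve café';

// Without the u modifier the subject is matched byte by byte, so \w usually
// stops at the multibyte letters and the words come back in fragments.
preg_match_all('/\w+/', $text, $byteMode);

// With the u modifier, pattern and subject are treated as UTF-8, and current
// PHP builds also let \w match Unicode letters.
preg_match_all('/\w+/u', $text, $utf8Mode);

print_r($byteMode[0]); // typically fragments such as "na", "ve", "caf"
print_r($utf8Mode[0]); // "naïve", "café"
```

This is why the function above lists its separators explicitly instead of relying on \W.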
In regular expressions it is better to be specific at all times, so that your expressions can handle things like the following, no matter where they are used:
echo get_snippet(' , ', 5);
Avoiding splitting could be worthwhile, however, in terms of performance. So you could use Kelly's approach, but swap \w for [^\s,\.;\?\!]+ and \W for [\s,\.;\?\!]+ . Personally, though, I like the simplicity of the splitting expression used above: it is easier to read and therefore easier to modify. The stack of PHP functions, however, is a bit ugly :)
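Kelly's code is not reproduced here, so the following is only a sketch of what a match-based variant might look like with the two character classes swapped in. The function name get_snippet_match is hypothetical, and whether it is actually faster than splitting would need measuring.

```php
<?php
// Hypothetical helper: matches the first N words directly instead of splitting.
// A sketch of the idea only, not Kelly's actual code.
function get_snippet_match($str, $wordCount = 10) {
    if ($wordCount < 1) {
        return '';
    }
    $word = '[^\s,\.;\?\!]+'; // a run of non-separator characters
    $gap  = '[\s,\.;\?\!]+';  // a run of separator characters
    // Up to ($wordCount - 1) "word + gap" pairs, followed by one final word.
    $pattern = '/^(?:' . $word . $gap . '){0,' . ($wordCount - 1) . '}' . $word . '/';
    return preg_match($pattern, $str, $m) === 1 ? $m[0] : '';
}

echo get_snippet_match("The quick,  brown fox; jumps over", 3), "\n"; // The quick,  brown
echo get_snippet_match('Привет, мир! Как дела?', 2), "\n";            // Привет, мир
```

Unlike the splitting version, this sketch returns an empty string when the input starts with a separator, since the pattern is anchored to a leading word.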
Pebbl Sep 16 2018-12-12T00: 00Z