I am trying to split a string of text into words using the PHP function preg_split.
$words = preg_split('/\W/u',$text);
It works great, except for Swedish characters such as å, ä and ö. Running utf8_encode on the text, or decoding it, does not help either. I assume that preg_split only works with single-byte characters and that the Swedish characters are multi-byte. Is there any other way to do this?
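For reference, here is a minimal sketch of one variant that might be worth testing. It assumes $text is already valid UTF-8 and that the script file itself is saved as UTF-8. It does not rely on \W at all and instead splits on anything that is not a Unicode letter or digit, using the \p{L} and \p{N} properties. The function name split_words and the sample sentence are made up for illustration only.

```php
<?php
// Hypothetical helper: split UTF-8 text into words on runs of
// characters that are neither Unicode letters nor Unicode digits.
function split_words(string $text): array
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // PREG_SPLIT_NO_EMPTY drops empty pieces at the start and end.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // With /u, preg_split returns false when the subject is not valid UTF-8.
    if ($words === false) {
        throw new InvalidArgumentException('Input does not look like valid UTF-8.');
    }

    return $words;
}

// Sample input (made up): yields Jag, äter, räksmörgås, på, ön
print_r(split_words('Jag äter räksmörgås på ön.'));
```

If the exception is thrown, the input is probably in another encoding such as ISO-8859-1, which would explain why å, ä and ö are mishandled and would need to be converted to UTF-8 first.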