I am trying to collapse runs of whitespace characters in a UTF-8 string in PHP using a regex. This regex
$txt = preg_replace( '/\s+/i' , ' ', $txt );
usually works fine, but some of the lines contain the Cyrillic letter "Р", which gets corrupted by the replacement. After a little research, I realized that the letter is encoded in UTF-8 as the two bytes \xD0 \xA0, and since \xA0 is a non-breaking space in single-byte encodings such as ISO-8859-1 (it is not part of ASCII), the regular expression replaces it with \x20, and the character is no longer valid.
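Here is a minimal sketch of what I believe is happening at the byte level. Whether a plain `\s` matches byte 0xA0 seems to depend on the PCRE build and locale, so this sketch lists `\xA0` explicitly in the character class to mimic the behavior I am seeing. That pattern and the sample string are assumptions for illustration, not my real data.

```php
<?php
// Cyrillic "Р" (U+0420) written out as its two UTF-8 bytes.
$letter = "\xD0\xA0";

// Assumption: a byte-wise pattern where 0xA0 counts as whitespace.
// \xA0 is added explicitly to imitate a setup whose \s matches that byte.
$pattern = '/[\s\xA0]+/';

$txt = "a" . $letter . "  b";              // "aР  b"
$out = preg_replace($pattern, ' ', $txt);

echo bin2hex($txt), "\n";                  // 61d0a0202062
echo bin2hex($out), "\n";                  // 61d02062

// An empty pattern with the u modifier only matches valid UTF-8,
// so this tells whether the result is still well-formed.
var_dump(preg_match('//u', $out) === 1);   // bool(false)
```

The trailing byte 0xA0 of the letter is swallowed together with the two spaces after it, which leaves a lone 0xD0 lead byte in the output.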
Any ideas how to do this correctly in PHP with regex?