This will remove everything except letters and spaces:
$tags = preg_replace("/[^a-z ]/i", "", $tags);
This will then collapse consecutive spaces into a single space:
$tags = preg_replace("/ {2,}/", " ", $tags);
If you want to allow other whitespace characters as well, but still replace them with single spaces, try this instead:
$tags = preg_replace("/[^a-z\s]/i", "", $tags);
$tags = preg_replace("/\s+/", " ", $tags);
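For reference, here is a minimal sketch that combines the two steps into one helper. The function name normalize_tags is hypothetical, and the final trim() is an extra assumption on my part (stripping already leaves a stray space at either end whenever the input starts or ends with a removed character); drop it if you want to keep leading and trailing spaces.

```php
<?php
// Hypothetical helper, not part of the original question.
function normalize_tags(string $tags): string
{
    // Remove everything that is not an ASCII letter or whitespace.
    $tags = preg_replace('/[^a-z\s]/i', '', $tags);
    // Collapse each run of whitespace (spaces, tabs, newlines) into one space.
    $tags = preg_replace('/\s+/', ' ', $tags);
    // Assumption: leading/trailing whitespace is not wanted either.
    return trim($tags);
}

echo normalize_tags("php,  regex\t& tags!"); // prints "php regex tags"
```

Note that the [a-z] class with the i flag only matches ASCII letters, so accented characters such as "é" are removed as well.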
Regarding your last sentence: there is no general way to do this. You will need to add specific rules. However, preg_replace_callback can help you identify unmodified letters.