How to replace all non-alphanumeric characters with a space in PHP?

$html = strip_tags($html);
$html = ereg_replace("[^A-Za-zäÄÜüÖö]", " ", $html);
$words = preg_split("/[\s,]+/", $html);

Shouldn't this replace everything except A-Z, a-z, and the umlauted vowels (ä, ö, ü) with spaces? Instead, I lose words that contain umlauts, such as zugänglich.

Is there something wrong with the regex?

Edit:

I replaced ereg_replace with preg_replace, but special characters such as : and ® are still not replaced by a space ...

+4
4 answers

Check out this link.

I assume that you are as German as I am, so I also assume that you can read this post.

-1

Whether your approach succeeds depends on the encoding. If all the umlauts were stripped out, your source text (or the PHP script) was probably encoded as UTF-8.

In that case, use this instead:

 $text = preg_replace('/[^\p{L}]/u', " ", $text); 

\p{L} matches any letter in any script, not just the umlauts, so the negated class replaces everything that is not a letter. The /u modifier probably solves your character set problem.
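To make this concrete, here is a minimal sketch of the whole pipeline from the question with the \p{L} pattern swapped in. It assumes the input is valid UTF-8. The function name extract_words and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: strip tags, blank out non-letters, split into words.
// Assumes $html is valid UTF-8; preg_replace returns null otherwise.
function extract_words(string $html): array
{
    $text = strip_tags($html);

    // \p{L} is any Unicode letter; /u makes PCRE read pattern and subject as UTF-8.
    $text = preg_replace('/[^\p{L}]/u', ' ', $text);
    if ($text === null) {
        return []; // invalid UTF-8 or another PCRE error
    }

    // Split on runs of whitespace and drop empty pieces.
    return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(extract_words('<p>Frei zugänglich: Marke®, für alle!</p>'));
```

Without the /u modifier, PCRE works byte by byte, so each two-byte UTF-8 umlaut is seen as two unrelated bytes, which would explain the lost words.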

+3

Perhaps your umlauts are still HTML entities (&auml; etc.). Those contain non-alphanumeric characters (& and ;), which get replaced ...
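If entities are the cause, one possible fix is to decode them before running the regex. The sketch below assumes UTF-8 throughout. The function name decode_then_clean is hypothetical.

```php
<?php
// Hypothetical helper: decode HTML entities first, then blank out non-letters.
function decode_then_clean(string $html): string
{
    // Turn entities such as &auml; into real UTF-8 characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Fall back to an empty string if PCRE rejects the input.
    return preg_replace('/[^\p{L}]/u', ' ', $text) ?? '';
}

echo decode_then_clean('zug&auml;nglich'), "\n";
```

Skipping the decoding step would turn the same input into "zug auml nglich", because & and ; are each replaced by a space.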

BTW: alphanumeric means not only a-z and A-Z, but also digits ...
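If digits should survive as well, the Unicode class \p{N} can be added to the negated set. This is only a sketch, assuming UTF-8 input. The function name alnum_words is made up.

```php
<?php
// Hypothetical helper: keep letters (\p{L}) and numbers (\p{N}), blank out the rest.
function alnum_words(string $text): array
{
    $clean = preg_replace('/[^\p{L}\p{N}]/u', ' ', $text) ?? '';

    // Split on runs of whitespace and drop empty pieces.
    return preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(alnum_words('Version 2.0: zugänglich®'));
```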

0

The regular expression must be /[^A-Za-zäÄÜüÖö]+/

0

Source: https://habr.com/ru/post/1346719/

