Problem with a function that removes accents and other characters in PHP

I found a simple function to remove some unwanted characters from a string.

    function strClean($input) {
        $input = strtolower($input);
        $b = array("á","é","í","ó","ú", "ñ", " "); // etc...
        $c = array("a","e","i","o","u","n", "-"); // etc...
        $input = str_replace($b, $c, $input);
        return $input;
    }

When I use it on accented or other special characters, such as "á é ñ", it prints question marks or strange characters. For example: output http://img217.imageshack.us/img217/6794/59472278.jpg

Note: strclean.php (which contains this function) and index.php are both saved as UTF-8. index.php is as follows:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title></title>
    </head>
    <body>
    <?php
    include('strclean.php');
    echo 'óóóáà';
    echo strClean('óóóáà');
    ?>
    </body>
    </html>

What am I doing wrong?

+1
string php unicode utf-8
Mar 03 '09 at 14:40
6 answers

I checked your code, and the error is in the strtolower function, which is not multibyte-aware.

Replace it with mb_strtolower, as below:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title></title>
    </head>
    <body>
    <?php
    function strClean($input) {
        // Multibyte-aware lowercasing of the UTF-8 input
        $input = mb_strtolower($input, 'UTF-8');
        $b = array("á","é","í","ó","ú", "ñ", " ");
        $c = array("a","e","i","o","u","n", "-");
        return str_replace($b, $c, $input);
    }

    $string = 'á é í ó ú n abcdef ghij';
    echo $string ."<br />". strClean($string);
    ?>
    </body>
    </html>
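The same idea can be written with a single lookup table instead of two parallel arrays. This is only a sketch: the name strCleanMap is made up for illustration, and it assumes the mbstring extension is loaded and the file is saved as UTF-8.

```php
<?php
// Sketch only: strCleanMap is a hypothetical name, not from the question.
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
function strCleanMap($input)
{
    // Lowercase with a multibyte-aware function so "Á" becomes "á" intact.
    $input = mb_strtolower($input, 'UTF-8');

    // One lookup table keeps each character next to its replacement.
    $map = array(
        'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u',
        'ñ' => 'n', ' ' => '-',
    );

    // strtr() with an array replaces whole byte sequences in a single pass.
    return strtr($input, $map);
}

echo strCleanMap('Á é Ñ abc'), "\n"; // a-e-n-abc
```

Keeping the pairs together makes it harder for the two lists to drift out of step when more characters are added.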
+2
Mar 03 '09 at 19:09

Use iconv to transliterate the string to ASCII:

 iconv('UTF-8', 'ASCII//TRANSLIT', $input); 
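A hedged sketch of how that call might be wrapped. The helper name toAscii and the locale name en_US.UTF-8 are assumptions, and the transliterated result depends on the iconv implementation and the current LC_CTYPE locale, so no particular output is promised here.

```php
<?php
// Sketch only: toAscii is a hypothetical helper name.
// The locale name below is an assumption; use one installed on your system.
function toAscii($input)
{
    // With glibc, a UTF-8 locale is typically needed for "á" to become "a"
    // rather than "?". setlocale() returns false if the locale is missing.
    // Note that this changes the locale for the rest of the script.
    setlocale(LC_CTYPE, 'en_US.UTF-8');

    // iconv() returns false on failure; @ hides the accompanying notice.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $input);

    // Fall back to the untouched input when the conversion is unavailable.
    return $ascii === false ? $input : $ascii;
}

echo toAscii('óóóáà'), "\n";
```

If the output still contains "?" or extra quote characters, that points to the locale or the iconv library rather than to the call itself.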
+5
Mar 03 '09 at 15:12

You can try iconv.

+4
Mar 03 '09 at 14:49

Does any replacement happen at all? That is, do you get the same weird characters when you print $input beforehand? If so, the character sets of your PHP source code and your input do not match, and you may need to run iconv() on the input before replacing.
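One way to check which encoding the input really arrives in is to look at its raw bytes. A minimal sketch, assuming the mbstring extension is loaded; describeBytes is a hypothetical helper name:

```php
<?php
// Sketch only: describeBytes is a hypothetical helper name.
// Assumes the mbstring extension is loaded for mb_check_encoding().
function describeBytes($input)
{
    return sprintf(
        '%s (%s)',
        bin2hex($input),
        mb_check_encoding($input, 'UTF-8') ? 'valid UTF-8' : 'not valid UTF-8'
    );
}

// "á" written as explicit bytes, so the result does not depend on how this
// file itself is saved.
echo describeBytes("\xC3\xA1"), "\n"; // c3a1 (valid UTF-8)
echo describeBytes("\xE1"), "\n";     // e1 (not valid UTF-8)
```

In UTF-8, "á" is the two bytes c3 a1; in ISO-8859-1 it is the single byte e1. If the input shows one form and the source file contains the other, str_replace cannot find a match.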

edit: I took both of your files and uploaded them to my web server, and both printing and cleaning work fine (see http://www.tag-am-meer.com/test1/). This is on PHP 4.4.9 and Firefox 3.0.6. More potential problems that come to mind:

  • Does this work for you in Firefox? I vaguely remember that IE6 (and probably later versions) expects the encoding in the HTML header section to be lowercase ("utf-8").
  • Does your editor include a byte order mark (BOM) in the code files? Mine doesn't; maybe PHP is choking on them.
  • Can you look at the HTTP headers to see whether anything unusual shows up, such as a bad MIME type? The Tamper Data add-in for Firefox may help with this.
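The BOM question can be checked from PHP itself. A small sketch (hasUtf8Bom is a hypothetical helper name) that tests whether a string starts with the three-byte UTF-8 byte order mark:

```php
<?php
// Sketch only: hasUtf8Bom is a hypothetical helper name.
function hasUtf8Bom($bytes)
{
    // The UTF-8 byte order mark is the three bytes EF BB BF.
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Typical use, with the file name taken from the question:
// var_dump(hasUtf8Bom(file_get_contents('strclean.php')));

var_dump(hasUtf8Bom("\xEF\xBB\xBFabc")); // bool(true)
var_dump(hasUtf8Bom("abc"));             // bool(false)
```

A BOM placed before the opening PHP tag is sent to the browser as output, so it is worth ruling out for both strclean.php and index.php.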
+3
Mar 03 '09 at 14:48

Why do you want to remove accents? Is it possible that you just want to ignore them? If so, this answer has a Perl solution that demonstrates how to do this. Please note that Perl is in a foreign language. :)

0
Mar 05 2018-11-11T00:

I ran into this problem before. I tried following the advice in this post and in others I found along the way, but there was no easy solution, because you need to know the encoding your system uses (ISO-8859-1 in my case). This is what I did:

    function quit_accenture($str) {
        // No "|" inside the brackets: within a character class it is a
        // literal pipe, not an alternation operator.
        $pattern = array();
        $pattern[0] = '/[ÁÂÀÅÄ]/';
        $pattern[1] = '/[ÉÊÈ]/';
        $pattern[2] = '/[ÍÎÌÏ]/';
        $pattern[3] = '/[ÓÔÒÖ]/';
        $pattern[4] = '/[ÚÛÙÜ]/';
        $pattern[5] = '/[áâàåä]/';
        $pattern[6] = '/[ðéêèë]/';
        $pattern[7] = '/[íîìï]/';
        $pattern[8] = '/[óôòøõö]/';
        $pattern[9] = '/[úûùü]/';

        $replacement = array();
        $replacement[0] = 'A';
        $replacement[1] = 'E';
        $replacement[2] = 'I';
        $replacement[3] = 'O';
        $replacement[4] = 'U';
        $replacement[5] = 'a';
        $replacement[6] = 'e';
        $replacement[7] = 'i';
        $replacement[8] = 'o';
        $replacement[9] = 'u';

        return preg_replace($pattern, $replacement, $str);
    }

    $txt = $_POST['your_htmled_text'];

    // Convert to your system charset (I looked mine up in php.ini)
    $txt = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $txt);

    // Apply your function
    $txt = quit_accenture($txt);

    // Output
    print_r($txt);

This worked for me, and I also think this is the right way :)
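If the text stays in UTF-8 instead of being converted to ISO-8859-1 first, the same kind of character-class patterns need the "u" modifier. A rough sketch covering only a few letters, assuming the file is saved as UTF-8 and PCRE has UTF-8 support; stripAccentsUtf8 is a made-up name:

```php
<?php
// Sketch only: stripAccentsUtf8 is a hypothetical name, and the patterns
// cover just a subset of letters for illustration.
// Assumes this file is saved as UTF-8 and PCRE was built with UTF-8 support.
function stripAccentsUtf8($str)
{
    // The "u" modifier makes each class match whole code points, not bytes.
    $patterns = array('/[ÁÂÀÅÄ]/u', '/[áâàåä]/u', '/[ÉÊÈË]/u', '/[éêèë]/u');
    $replacements = array('A', 'a', 'E', 'e');

    return preg_replace($patterns, $replacements, $str);
}

echo stripAccentsUtf8('Ángel é Àë'), "\n"; // Angel e Ae
```

Without the "u" modifier, a class such as [áâàåä] in a UTF-8 file is treated as a set of individual bytes, which can corrupt other multibyte characters.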

0
Apr 09 '14 at 17:14


