You need to use UTF-8 all the way through. Make sure that:
your database connection is UTF-8 (using mysql_set_charset);
the pages you output are marked as UTF-8 (<meta http-equiv="Content-Type" content="text/html;charset=utf-8">);
when you output strings from the database, you HTML-encode them with htmlspecialchars(), not htmlentities().
htmlentities() HTML-encodes all non-ASCII characters, and by default assumes that the bytes you pass it are ISO-8859-1. Therefore, if you pass it “ encoded as UTF-8 (bytes 0xE2, 0x80, 0x9C), each byte is treated as a separate character and you get garbage such as â€œ instead of the expected &ldquo; or &#8220;. This can be fixed by passing utf-8 as the optional $charset argument.
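As a minimal sketch of the difference, the snippet below runs the same three bytes through htmlentities() twice, once declared as ISO-8859-1 and once as UTF-8. The variable names are illustrative only, and the charset is passed explicitly in both calls because the default is not the same in every PHP version.

```php
<?php
// U+201C LEFT DOUBLE QUOTATION MARK, written out as its UTF-8 bytes.
$quote = "\xE2\x80\x9C";

// Declared as ISO-8859-1: the three bytes are handled as three separate characters.
$wrong = htmlentities($quote, ENT_QUOTES, 'ISO-8859-1');

// Declared as UTF-8: the three bytes are decoded as one character first.
$right = htmlentities($quote, ENT_QUOTES, 'UTF-8');

echo $right, "\n";            // &ldquo;
var_dump($wrong === $right);  // bool(false)
```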
However, it is usually simpler to just use htmlspecialchars(), as this leaves non-ASCII characters alone, as raw bytes, instead of turning them into HTML entity references. This results in smaller page output, so it is preferable as long as you are sure that the HTML you produce will keep its charset information (which you can usually rely on, except in contexts such as sending HTML fragments by mail).
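A short sketch of that behaviour, using a made-up sample string: htmlspecialchars() escapes only the HTML-significant ASCII characters and passes the UTF-8 bytes of the curly quotes through unchanged.

```php
<?php
// Sample text: curly quotes (as UTF-8 bytes) around markup-significant ASCII characters.
$text = "\xE2\x80\x9CTom & Jerry\xE2\x80\x9D <b>";

// Only &, <, > and the ASCII quote characters are replaced; non-ASCII bytes are left as-is.
$escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n"; // “Tom &amp; Jerry” &lt;b&gt;
```

The output is only displayed correctly if the page itself is served and marked as UTF-8, which is why the charset declaration in the checklist matters.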
htmlspecialchars() also has an optional $charset argument, but setting it to utf-8 is not critical, since it results in no change from the default ISO-8859-1 behaviour: the characters it escapes are all ASCII, and UTF-8 never uses ASCII bytes inside a multibyte sequence. If you are generating output in old-school multibyte encodings such as Shift-JIS, you do have to worry about setting this argument correctly, but today that is quite rare, as most sane people prefer UTF-8.