Cannot insert non-latin characters in MySQL

Question

I am writing a web application using MySQL version 5.1.45, Tomcat 5.5.28 and Hibernate 3.

When I try to save a string containing non-Latin characters (for example, Упячка), the following error occurs:

1589 [main] WARN org.hibernate.util.JDBCExceptionReporter - SQL Error: 1366, SQLState: HY000
1589 [main] ERROR org.hibernate.util.JDBCExceptionReporter - Incorrect string value: '\xD0\xA3\xD0\xBF\xD1\x8F...' for column 'name' at row 1

Hibernate connection settings

 <property name="connection.driver_class">com.mysql.jdbc.Driver</property>
 <property name="connection.url">jdbc:mysql://localhost/E2012?characterEncoding=UTF8&amp;useUnicode=true</property>
 <property name="connection.username">***</property>
 <property name="connection.password">***</property>
 <property name="hibernate.connection.charSet">UTF8</property>

MySQL configuration (my.cnf)

 [client]
 default-character-set=utf8

 [mysqld]
 default-character-set=utf8

Even running the SET NAMES utf8 query does not resolve the problem.

Thanks for the help!

java mysql encoding unicode

glebreutov May 04 '10 at 2:17

2 answers

BalusC · Answer 1 · 2010-05-04T02:45:32+0000

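Those characters should actually be represented as the code points \x423\x43F\x44F\x447\x43A\x430. The value \xD0\xA3\xD0\xBF\xD1\x8F... is their UTF-8 byte sequence with every byte treated as a separate character, which implies that the bytes were incorrectly decoded using ISO-8859-1 somewhere along the way.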

Here is a test snippet that proves this:

 // Encode as UTF-8, then (incorrectly) decode the resulting bytes as ISO-8859-1.
 String s = new String("Упячка".getBytes("UTF-8"), "ISO-8859-1");
 for (char c : s.toCharArray()) {
     System.out.printf("\\x%X", (int) c);
 }

This prints:

 \xD0\xA3\xD0\xBF\xD1\x8F\xD1\x87\xD0\xBA\xD0\xB0
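
To make the difference between code points and encoded bytes more concrete, here is a minimal sketch in plain Java (Java 7 or newer is assumed for StandardCharsets). The class and method names are hypothetical and exist only for illustration. It prints the code points of the string, prints the garbled form, and shows that the garbling can be reversed as long as nothing else has altered the bytes in between.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class MojibakeDemo {

    // Formats every UTF-16 char of the string as \x followed by its hex value.
    static String hexChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("\\x%X", (int) c));
        }
        return sb.toString();
    }

    // Encodes as UTF-8, then wrongly decodes those bytes as ISO-8859-1.
    static String garble(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses garble(): re-encode as ISO-8859-1, then decode as UTF-8.
    static String repair(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The six Cyrillic letters from the question, written as escapes
        // so the result does not depend on the source file encoding.
        String original = "\u0423\u043F\u044F\u0447\u043A\u0430";
        String garbled = garble(original);

        System.out.println(hexChars(original));               // code points
        System.out.println(hexChars(garbled));                // UTF-8 bytes, one char per byte
        System.out.println(repair(garbled).equals(original)); // round trip
    }
}
```

Repairing after the fact is only a diagnostic aid here. The better fix is to find the place where the wrong charset is applied.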

So your problem needs to be solved one step earlier. Since you are talking about a Java web application, and this string most likely comes from user input, are you sure that you have taken care of the encoding of the HTTP request and response? To start, add the following to the top of the JSP:

 <%@ page pageEncoding="UTF-8" %>

This not only outputs the page using UTF-8, but also implicitly sets the HTTP Content-Type response header, telling the client that the page is encoded in UTF-8, so that the client knows to display any content and submit any forms using the same encoding.

Now for the HTTP request. For GET requests, you need to configure the servlet container in question. In Tomcat, for example, this is a matter of setting the URIEncoding attribute of the <Connector> element in /conf/server.xml accordingly. For POST requests, this should already be taken care of by the client (the web browser), which should be smart enough to use the response encoding specified in the JSP. If that is not the case, you need to bring in a Filter that checks and sets the request encoding.
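
Why the container's URI encoding matters can be shown without a servlet container. The following is a rough sketch using only java.net.URLEncoder and java.net.URLDecoder. It imitates the decoding step and is not Tomcat's actual implementation. The class and method names are hypothetical. A browser that submits a form from a UTF-8 page percent-encodes the UTF-8 bytes, and the server only recovers the original text if it decodes them with the same charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; it imitates query-string decoding only.
class QueryDecodingDemo {

    // Percent-encodes the value the way a browser on a UTF-8 page would.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decodes a percent-encoded value using the given charset name.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String name = "\u0423\u043F\u044F\u0447\u043A\u0430";
        String raw = encode(name);

        // Matching charset: the original string comes back.
        System.out.println(decode(raw, "UTF-8").equals(name));
        // Mismatched charset: every UTF-8 byte becomes its own character.
        System.out.println(decode(raw, "ISO-8859-1").length());
    }
}
```

With the mismatched charset, the six letters come back as twelve characters, which is the same kind of garbling that shows up in the error message.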

For more background information, you may find this article useful.

Besides all this, MySQL has another issue with Unicode characters. It only supports UTF-8 characters of up to 3 bytes, not 4 bytes. In other words, only the BMP range (U+0000 to U+FFFF) is supported; characters outside of it are not. PostgreSQL, for example, supports them fully. This may not hurt your web application, but it is definitely something to keep in mind.
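
If you want to know up front whether a given string would run into that 3-byte limit, a check along the following lines could help. This is a minimal sketch assuming Java 8 or newer and well-formed input (no unpaired surrogates). The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class BmpCheck {

    // True if every code point lies in the BMP, so that each one
    // needs at most 3 bytes when encoded as UTF-8.
    static boolean fitsInThreeByteUtf8(String s) {
        return s.codePoints().allMatch(Character::isBmpCodePoint);
    }

    // Number of bytes that a single code point takes in UTF-8.
    static int utf8Length(int codePoint) {
        String single = new String(Character.toChars(codePoint));
        return single.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String cyrillic = "\u0423\u043F\u044F\u0447\u043A\u0430";
        // U+1F600 lies outside the BMP and is a surrogate pair in Java.
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(fitsInThreeByteUtf8(cyrillic)); // true
        System.out.println(fitsInThreeByteUtf8(emoji));    // false
        System.out.println(utf8Length(0x423));             // 2
        System.out.println(utf8Length(0x1F600));           // 4
    }
}
```

Cyrillic text such as the string in the question takes two bytes per letter, so it is not affected by this limit.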

ryanprayogo · Answer 2 · 2010-05-04T02:27:10+0000

Try using UTF-8 for the characterEncoding parameter in the JDBC URL, not UTF8 (note the dash).

This has happened to me before.
