How can I process the data to avoid the MySQL "Incorrect string value" error?

I am trying to use a Rake task to migrate some legacy data from MS Access to MySQL. I am working on Windows XP using Ruby 1.8.6.

My Rails encoding is set to "utf8" in database.yml.

In addition, the default character set for MySQL is utf8.

99% of the data comes in normally, but from time to time I get a column value that gives me an error something like this:

 Mysql::Error: Incorrect string value: '\x92 Comm...' for column 'name' at row 1: INSERT INTO `organizations` ( [...] ) VALUES('Lawyers' Committee', [...] ) 

It seems that the character MySQL has a problem with is the apostrophe immediately after the "s" in the word "Lawyers".

Here is another one ...

 Mysql::Error: Incorrect string value: '\x99 aoc' for column 'department' at row 1: INSERT INTO `addresses` [...] 'TRInfo™ aoc' [....] 

It looks like it is choking on the TM character after TRInfo.

Is there a Ruby or Rails method I can run the data through to strip out any characters that MySQL will choke on?

Ideally, it would be great to replace them with friendlier characters: replace the apostrophe with a single quote and the TM character with the string "(TM)".

Or, if I could somehow configure MySQL to store these characters as is without errors, that would be great too.

+4
7 answers

It looks like your input is not in UTF-8.

I looked into it a bit: the curly quote used in "Lawyers" is encoded as \x92 in Windows-1252, but that byte on its own is nonsense in UTF-8 (when I decoded it and re-encoded it as UTF-8, I got \xe2\x80\x99).

So you will need to convert the input strings from Windows-1252 to UTF-8 (or to Unicode).
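
In Ruby 1.9 or later, this conversion can be sketched with String#force_encoding and String#encode. Ruby 1.8.6 has neither method, so there the stdlib Iconv library would be the likely route instead. The helper names cp1252_to_utf8 and to_ascii_friendly and the replacement table are made up for illustration, and the sketch assumes every incoming string really is Windows-1252.

```ruby
# Hypothetical helper: reinterpret raw bytes as Windows-1252 and transcode
# them to UTF-8. Bytes that Windows-1252 leaves undefined become "?".
def cp1252_to_utf8(raw)
  raw.dup.force_encoding("Windows-1252").encode("UTF-8", undef: :replace, replace: "?")
end

# Optional second step (a preference from the question, not something
# MySQL requires): swap a few typographic characters for ASCII stand-ins.
ASCII_STAND_INS = {
  "\u2018" => "'",    # left single quotation mark
  "\u2019" => "'",    # right single quotation mark (byte 0x92 in Windows-1252)
  "\u2122" => "(TM)"  # trade mark sign (byte 0x99 in Windows-1252)
}.freeze

def to_ascii_friendly(text)
  text.gsub(/[\u2018\u2019\u2122]/, ASCII_STAND_INS)
end

name = cp1252_to_utf8("Lawyers\x92 Committee".b)
puts to_ascii_friendly(name)
```

If the columns really are utf8, the first step alone should be enough for MySQL to accept the value; the second step only matters if you prefer plain ASCII in the stored data.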

+5

I had the same problem when loading the contents of UTF-16 encoded files, which usually store one character per 16-bit code unit, into MySQL tables with Java. The problem was that the UTF-16 encoded string contained so-called surrogate pairs. This means that two consecutive 16-bit UTF-16 code units together encode one special character, and neither of them can be translated on its own into the corresponding UTF-8 encoding. See Wikipedia for more details.

The solution was simply to replace these characters with spaces. This is the range of characters you want to remove from your string: U+D800 to U+DFFF
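
A rough Ruby counterpart, for comparison with the Java approach above: a valid UTF-8 string in Ruby never contains lone surrogates, so the equivalent is to replace the code points above U+FFFF, which are the ones UTF-16 represents as surrogate pairs. Those need 4 bytes in UTF-8, which MySQL's 3-byte utf8 character set cannot store. The helper name replace_supplementary is hypothetical, and the sketch assumes Ruby 1.9 or later with UTF-8 input.

```ruby
# Hypothetical helper: replace every character outside the Basic
# Multilingual Plane (U+10000..U+10FFFF) with a single space.
def replace_supplementary(text)
  text.gsub(/[\u{10000}-\u{10FFFF}]/, " ")
end

puts replace_supplementary("ok \u{1F600} done")
```

Characters inside the Basic Multilingual Plane, such as the trademark sign, pass through unchanged.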

+1

In general, this happens when you insert rows into columns with an incompatible encoding/collation.

I got this error when I had a TRIGGER that for some reason inherited the server collation. The MySQL default (at least on Ubuntu) is Latin-1 with Swedish collation. Even though my database and all the tables were set to UTF-8, I still had to set this in my.cnf:

/etc/mysql/my.cnf:

 [mysqld]
 character-set-server=utf8
 default-character-set=utf8

And this query should list all triggers with utf8-* values:

 select TRIGGER_SCHEMA, TRIGGER_NAME, CHARACTER_SET_CLIENT, COLLATION_CONNECTION, DATABASE_COLLATION from information_schema.TRIGGERS 

And the variables listed here should also show utf8-* values (not Latin-1 or another encoding):

 show variables like 'char%'; 
+1

It looks like your legacy database is in one character set (utf8?) and your Rails app is expecting something else. If you are inputting utf8, have you tried setting up Rails to support it?

0
 I encountered the same problem today.
 After many attempts, I found the cause and finally fixed it.
 Applications store data using the default MySQL character set and collation (latin1, latin1_swedish_ci), so you need to set the character set and collation to utf8 / utf8_general_ci when you create your database or table.
 e.g.:
         $sql = "CREATE TABLE " . $table_name . " (
         id mediumint(9) NOT NULL AUTO_INCREMENT,
         bookname varchar(128) NOT NULL,
         author varchar(64) NOT NULL,
         PRIMARY KEY (id),
         KEY (bookname)
         ) CHARACTER SET utf8 COLLATE utf8_general_ci;";

 Reference:
 《Mysql create table problem?  SOLVED !!!!!!!!!!!!》
 http://forums.mysql.com/read.php?121,193883,193883
 《10.1.5.  Configuring the Character Set and Collation for Applications》
 http://dev.mysql.com/doc/refman/5.0/en/charset-applications.html

 Hoping this can help you.
0

Adding binary before weirdcolumn solved the problem.

In my case, I have an update trigger on tableA that inserts data into another table. There are some special characters in the weirdcolumn column, and the update failed with the message: "ERROR 1366 (HY000): Incorrect string value: '\xE7....'"

After a lot of digging, I found a solution: add binary before the column name, or use cast(weirdcolumn as binary);

Hope this helps.

0

I had the same problem when importing data from SQL Server to MySQL using PHP. My solution was to use utf8_encode() when inserting into MySQL and utf8_decode() when retrieving from MySQL to display in the browser. Here is the full code, which works well.

 // For string values
 $Gro2 = (is_null($row["GrpNm"])) ? "NULL" : "\"" . mysql_escape_string(utf8_encode($row["GrpNm"])) . "\"";
 $sqlMy = "INSERT INTO `tbl_name` VALUES ($Gro2)";

Please note: for new projects use

 mysqli_escape_string() 


0

Source: https://habr.com/ru/post/1285712/

