I have a MySQL table column that I am trying to convert from latin1 to UTF8. Some of the values in the column are already UTF8, but they are saved as latin1, which leads to some strange text. Column switching is pretty straight forward, all I have to do is:
ALTER TABLE `user_profiles` MODIFY `last_name` varchar(20) CHARACTER SET utf8;
The next step is to convert all now double-encoded columns back to UTF8. I can get a list of all these columns by running the following command:
SELECT `last_name`, CONVERT(CAST(CONVERT(`last_name` USING latin1) AS BINARY) USING utf8) AS `converted_last_name` FROM `user_profiles` WHERE (CONVERT(CAST(CONVERT(`last_name` USING latin1) AS BINARY) USING utf8)) IS NOT NULL AND CONVERT(CAST(CONVERT(`last_name` USING latin1) AS BINARY) USING utf8) != `last_name`;
Which gives me something like:
| last_name | converted_last_name | | ----------------|----------------------| | 王维雄 | 王维雄 | | niño de rivera | niño de rivera | | Thölix | Thölix |
Looks nice. Now that everything is getting weird. If I run the update command:
UPDATE `user_profiles` SET `last_name` = CONVERT(CAST(CONVERT(`last_name` USING latin1) AS BINARY) USING utf8) WHERE (CONVERT(CAST(CONVERT(`last_name` USING latin1) AS BINARY) USING utf8)) IS NOT NULL AND CONVERT(CAST(CONVERT(`last_name` USING latin1) AS BINARY) USING utf8) != `last_name`
I get error 1300
as follows:
#1300 - Invalid utf8 character string: 'E36F'
Any idea why the update is different from the selected one? Any thoughts on how to get around this?
Swift source share