PostgreSQL 8.4 Coding Error

I am importing data from a CSV file. One of the fields contains an accented character (Telefónica O2 UK Limited). The application raises an error when inserting the data into a table:

PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xf36e6963 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". : INSERT INTO "companies" ("name", "validated") VALUES(E'Telef?nica O2 UK Limited', 't') 

Entering names with accents and umlauts through forms works fine. How do I solve this problem?

Edit

I worked around the problem by converting the file's encoding: I uploaded the CSV file to Google Docs and exported it back as CSV.
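The same conversion can be done locally in Ruby instead of round-tripping through Google Docs. A minimal sketch, assuming the file came from Excel as Windows-1252 (in that encoding the byte 0xF3 is "ó", the same byte PostgreSQL rejected as an invalid UTF-8 sequence):

    # Sample bytes standing in for a line of the CSV; for a real file you
    # would use File.read(path, encoding: 'Windows-1252') instead.
    raw = "Telef\xF3nica O2 UK Limited".force_encoding('Windows-1252')
    utf8 = raw.encode('UTF-8')
    puts utf8  # prints "Telefónica O2 UK Limited"

For a whole file: `File.write(out_path, File.read(in_path, encoding: 'Windows-1252').encode('UTF-8'))` (file names here are assumptions).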

+4
2 answers

The error message is pretty clear: your client_encoding parameter is set to UTF8, and you are trying to insert characters that are not encoded in UTF-8 (if the CSV comes from MS Excel, your file is probably encoded in Windows-1252).

You can either convert the data in your application, or change your PostgreSQL connection's encoding to match the data you want to insert (that way PostgreSQL performs the conversion for you). Do this by executing SET CLIENT_ENCODING TO 'WIN1252'; on your PostgreSQL connection before inserting the data. After the import you should restore the original value with RESET CLIENT_ENCODING;
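In a psql session the import sequence might look like this (a sketch; the table layout matches the INSERT in the error message, but the CSV file name is an assumption):

    SET CLIENT_ENCODING TO 'WIN1252';
    \copy companies (name, validated) FROM 'companies.csv' CSV
    RESET CLIENT_ENCODING;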

HTH!

+6

I think you can try the Ruby gem rchardet, which detects the encoding for you and might be a better solution. Code example:

    require 'rchardet'

    cd = CharDet.detect(string_of_unknown_encoding)
    encoding = cd['encoding']
    # Iconv was removed from Ruby's stdlib in 2.0; String#encode does the same job:
    converted_string = string_of_unknown_encoding.encode('UTF-8', encoding)

Here are some links:

https://github.com/jmhodges/rchardet

http://www.meeho.net/blog/2010/03/ruby-how-to-detect-the-encoding-of-a-string/

+1

Source: https://habr.com/ru/post/1309354/
