Problem with encoding from database in JavaMail

I have a small application that reads from an Oracle 9i database and sends data via email using JavaMail. The database has NLS_CHARACTERSET = "WE8MSWIN1252", that is, CP1252.

If I run the application without any parameters, it works fine and emails are sent correctly. However, I have a requirement that forces me to run the application with the -Dfile.encoding=utf8 parameter, and as a result the text is sent with corrupted characters.

I tried to change the encoding of the data read from the database using

 String textToSend = new String(textRead.getBytes("CP1252"), "UTF-8"); 

But that does not help. I tried all possible combinations of CP1252, windows-1252, ISO-8859-1 and UTF-8, but still no luck.

Any ideas?


Update to clarify my problem: when I do the following:

 Statement stat = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
 stat.executeQuery("SELECT blah FROM blahblah ...");
 ResultSet rs = stat.getResultSet();
 String textRead = rs.getString("whatever");

textRead comes back corrupted because the database is CP1252 and the application is running in UTF-8. Another approach that I tried, which also failed:

 InputStream is = rs.getBinaryStream("whatever");
 Writer writer = new StringWriter();
 char[] buffer = new char[1024];
 Reader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
 int n;
 while ((n = reader.read(buffer)) != -1) {
     writer.write(buffer, 0, n);
 }
 String textRead = writer.toString();
5 answers

Your driver should do the conversion automatically, and since every character in CP1252 also exists in Unicode (and can therefore be encoded in UTF-8), you should not lose information.

Can you try the following: get the string with ResultSet.getString and write it to a file. Then open the file in an editor that lets you specify the UTF-8 character set (for example, jEdit).

The file should contain valid UTF-8 data.
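A minimal sketch of that check, assuming Java 7 or later. The class name `Utf8FileCheck`, its methods, and the sample text are made up for illustration; in the real application the string would come from `ResultSet.getString`. The file is written with an explicit UTF-8 charset so the result does not depend on `file.encoding`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8FileCheck {

    /** Writes the text to a temporary file as UTF-8 and returns the file's path. */
    static Path writeAsUtf8(String text) {
        try {
            Path file = Files.createTempFile("textRead", ".txt");
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the file back and formats its raw bytes as space-separated hex. */
    static String hexDump(Path file) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : Files.readAllBytes(file)) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(String.format("%02X", b & 0xFF));
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String textRead = "10 \u20AC"; // stand-in for rs.getString("whatever")
        Path file = writeAsUtf8(textRead);
        System.out.println("Wrote " + file);
        System.out.println(hexDump(file));
    }
}
```

If the euro sign appears in the dump as `E2 82 AC`, the string was intact when it was written. A `3F` (a literal question mark) or `EF BF BD` (the Unicode replacement character) in its place suggests the text was already damaged before it reached the file.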


You seem to have gotten lost in charset land, which is understandable :-)

This line

 String textToSend = new String(textRead.getBytes("CP1252"), "UTF-8"); 

doesn't make much sense. You already have text; this line encodes it into a "cp1252" byte[] and then tells the VM to treat those bytes as if they were "UTF-8" (which is a lie...).

In short: if you have a String, as in textRead, you do not need to convert it at all. If something goes wrong, either the text is already corrupted when you read it (look at it in the debugger), or it gets corrupted later in some API. Can you check and come back with more details: where exactly is the text wrong, and where exactly do you read or write it?
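To make the point concrete, here is a small sketch that reproduces the round trip from the question on a hard-coded string. The class name `ReencodeDemo` and the sample words are assumptions for illustration only; no database is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ReencodeDemo {
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Same operation as in the question: encode as CP1252, then decode as UTF-8. */
    static String brokenRoundTrip(String text) {
        return new String(text.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "cafe";
        String accented = "caf\u00E9"; // "café"

        // Pure ASCII has the same bytes in both charsets, so it survives.
        System.out.println(ascii.equals(brokenRoundTrip(ascii)));       // prints true

        // 0xE9 on its own is not a valid UTF-8 sequence, so the text is damaged.
        System.out.println(accented.equals(brokenRoundTrip(accented))); // prints false
    }
}
```

The round trip can only leave the string unchanged or damage it; it never repairs anything, because a Java String has no byte encoding to "fix" in the first place.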


Your database data is in windows-1252. Assuming it is passed verbatim through the JDBC driver, that is the encoding you must specify when you convert it to a Java String:

 Statement stat = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
 ResultSet rs = stat.executeQuery("SELECT blah FROM blahblah ...");
 byte[] rawbytes = rs.getBytes("whatever");
 String textRead = new String(rawbytes, "windows-1252");

Is the requirement that the data be sent as UTF-8? If so, the UTF-8 part belongs on the output side, not the input side. Once you have String data in Java, it is stored internally as UTF-16. So when you serialize it into a MimeMessage, you again need to choose an encoding:

 mimebodypart.setText(textRead, "UTF-8"); 
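The same idea (decode on input, encode on output) can be tried without a database or mail server. The sketch below uses a hand-made byte array in place of `rs.getBytes(...)`, and plain `getBytes` in place of JavaMail; the class and method names are hypothetical. The byte 0x80 is the euro sign in windows-1252.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeThenEncode {
    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    /** Input side: interpret raw database bytes as windows-1252 text. */
    static String decodeFromDatabase(byte[] rawBytes) {
        return new String(rawBytes, WINDOWS_1252);
    }

    /** Output side: serialize the text as UTF-8 for the outgoing message. */
    static byte[] encodeForMail(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] rawBytes = { '1', '0', ' ', (byte) 0x80 }; // "10 €" in windows-1252
        String text = decodeFromDatabase(rawBytes);
        byte[] utf8 = encodeForMail(text);

        System.out.println(text.length()); // prints 4 (four characters)
        System.out.println(utf8.length);   // prints 6 (the euro sign takes 3 bytes in UTF-8)
    }
}
```

Both conversions name their charset explicitly, so neither depends on the `file.encoding` setting the application is started with.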

I had the same problem:

An Oracle database using the WE8MSWIN1252 encoding, with some VARCHAR2 text columns containing the euro sign (€). Sending that text with JavaMail caused problems with the euro sign.

It finally works. Two important things you should check / do:

  • Be sure to use the latest Oracle JDBC driver for your version of Java.
  • Specify the encoding (preferably UTF-8) in JavaMail, for example:

    MimeMessage.setSubject(String text, "UTF-8")
    MimeMessage.setText(String text, "UTF-8")

    This way, the email text is encoded in UTF-8.

    NOTE: Because RFC 821 restricts mail messages to 7-bit US-ASCII, 8-bit characters or binary data must be encoded into a 7-bit format. The "Content-Transfer-Encoding" email header indicates the encoding used. For more information: http://www.w3.org/Protocols/rfc1341/5_Content-Transfer-Encoding.html
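As a rough illustration of what a 7-bit transfer encoding does, the sketch below encodes UTF-8 text as base64 using only the JDK (Java 8 or later). It is not JavaMail code, and JavaMail chooses the transfer encoding itself; the class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class TransferEncodingSketch {

    /** Encodes text as UTF-8, then as base64 so only 7-bit characters remain. */
    static String toBase64Body(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getMimeEncoder().encodeToString(utf8);
    }

    /** Reverses toBase64Body, as a receiving mail client would. */
    static String fromBase64Body(String body) {
        byte[] utf8 = Base64.getMimeDecoder().decode(body);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** Returns true if every character fits in 7-bit US-ASCII. */
    static boolean isSevenBit(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        String body = toBase64Body("\u20AC");             // the euro sign
        System.out.println(body);                         // prints 4oKs
        System.out.println(isSevenBit(body));             // prints true
        System.out.println(fromBase64Body(body).equals("\u20AC")); // prints true
    }
}
```

A message body encoded like this would be labelled with `Content-Transfer-Encoding: base64` and a `Content-Type` whose charset is UTF-8, so the receiver knows how to undo both steps.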

Can you do the conversion in the database? Instead of:

 SELECT blah FROM blahblah 

try this (Oracle's CONVERT takes the destination character set first, then the source):

 SELECT convert(blah, 'UTF8', 'WE8MSWIN1252') FROM blahblah 

Source: https://habr.com/ru/post/1335403/


All Articles