I am reading an XML document (UTF-8) and ultimately display its content on a web page that uses ISO-8859-1. As expected, several characters are not displayed correctly, for example " and (they are displayed as ?).
Is it possible to convert these characters from UTF-8 to ISO-8859-1?
Here is the code snippet that I wrote for this:
BufferedReader br = new BufferedReader(
        new InputStreamReader(urlConnection.getInputStream(), "UTF-8"));
StringBuilder sb = new StringBuilder();
String line = null;
while ((line = br.readLine()) != null) {
    sb.append(line);
}
br.close();
byte[] latin1 = sb.toString().getBytes("ISO-8859-1");
return new String(latin1);
I'm not quite sure what is going wrong, but I suspect that readLine() is the culprit (since the lines it returns are Java strings, i.e. UTF-16?). Another option I tried was to replace the latin1 line with
byte[] latin1 = new String(sb.toString().getBytes("UTF-8")).getBytes("ISO-8859-1");
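To make the symptom reproducible without the URL connection, here is a minimal sketch of what I think is going on. The class name Latin1Demo, the helper names and the sample string are made up for illustration, and it assumes Java 7 or later for StandardCharsets. It checks whether a character exists in ISO-8859-1 at all and shows what getBytes produces when it does not.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1Demo {

    // Returns true if the character has a mapping in ISO-8859-1.
    static boolean fitsLatin1(char c) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(c);
    }

    // Encodes the text as ISO-8859-1. Characters without a mapping are
    // replaced by the charset's replacement byte, which is '?' (0x3F).
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical sample: curly quotes (U+201C, U+201D) around "caf\u00E9".
        String sample = "\u201Ccaf\u00E9\u201D";

        System.out.println(fitsLatin1('\u00E9')); // e with acute accent: mappable
        System.out.println(fitsLatin1('\u201C')); // left curly quote: not mappable

        byte[] latin1 = toLatin1(sample);
        // Decode with an explicit charset instead of the platform default.
        String roundTrip = new String(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(roundTrip.equals("?caf\u00E9?"));
    }
}
```

If this sketch is right, the information is already lost at the getBytes step: a character with no ISO-8859-1 mapping is turned into ? no matter how the string was read, and new String(latin1) without a charset additionally depends on the platform default encoding.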
I have read previous posts on this subject, and I am learning as I go. Thanks in advance for your help.
java utf-8 character-encoding iso-8859-1
Chocula Aug 13 '09 at 19:08