
UTF-16 encoding

<?xml version="1.0" encoding="UTF-16"?>
<note>
  <from>Jani</from>
  <to>ALOK</to>
  <message>AshuTosh</message>
</note>

My XML parser supports only UTF-8, so parsing this document throws a SAX parser exception. How can I convert UTF-16 to UTF-8?

1 answer

If it really accepts only UTF-8, then what you are using is not a conforming XML parser; see section 2.2 of the XML specification:

All XML processors MUST accept the UTF-8 and UTF-16 encodings of Unicode

Java XML parsers usually receive their input wrapped in an InputSource object. An InputSource can be constructed from a Reader, which decodes the bytes into characters using a given encoding.

 InputStream in = ...
 InputSource is = new InputSource(new InputStreamReader(in, "utf-16"));

For the "utf-16" charset, the stream should begin with a byte order mark (BOM); if it does not, use "utf-16le" or "utf-16be" instead.
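
As a rough, self-contained sketch of the approach above (not necessarily how your parser is set up), the following feeds UTF-16 bytes to the JDK's SAX parser through a Reader. The class name Utf16Example and the helper parseText are made up for illustration, and the sample document is built in memory rather than read from a real stream. Because the InputSource carries a character stream, the parser reads the decoded characters directly and does not rely on the encoding named in the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class Utf16Example {

    // Decodes the stream with the given charset, parses it with SAX,
    // and returns all character data concatenated in document order.
    static String parseText(InputStream in, Charset charset)
            throws IOException, SAXException, ParserConfigurationException {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        // The Reader does the byte-to-character decoding, not the parser.
        Reader reader = new InputStreamReader(in, charset);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(reader), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<note><from>Jani</from><to>ALOK</to>"
                + "<message>AshuTosh</message></note>";

        // Java's UTF-16 encoder writes a byte order mark, so "UTF-16" can decode it.
        byte[] withBom = xml.getBytes(StandardCharsets.UTF_16);
        System.out.println(
                parseText(new ByteArrayInputStream(withBom), StandardCharsets.UTF_16));

        // UTF-16LE bytes carry no byte order mark, so the byte order is named explicitly.
        byte[] littleEndian = xml.getBytes(StandardCharsets.UTF_16LE);
        System.out.println(
                parseText(new ByteArrayInputStream(littleEndian), StandardCharsets.UTF_16LE));
    }
}
```

Both calls should print the same text, since only the byte-level encoding differs. No separate UTF-16 to UTF-8 conversion step is needed with this approach.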


Source: https://habr.com/ru/post/1398021/
