The VTD-XML library seems to read byte array data. In this case, I would suggest converting the string to bytes using the correct encoding.
If an encoding is declared at the beginning of the XML string:
<?xml version="1.0" encoding="UTF-8"?>
Then use this:
myString.getBytes("UTF-8")
If there is no encoding declaration, add one so that VTD-XML knows how to decode the bytes:
String withHeader = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + myString;
byte[] bytes = withHeader.getBytes("UTF-8");
VTDGen vg = new VTDGen();
vg.setDoc(bytes);
vg.parse(true);
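Note that in the latter case you can use any valid encoding, as long as the declaration you prepend names the same encoding you pass to getBytes. The string you have in memory is encoding-agnostic: internally it is UTF-16, and it is only converted to a specific encoding when you ask for its bytes.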
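To illustrate the point about the string being encoding-agnostic, here is a minimal sketch that uses only the JDK, so it runs without VTD-XML on the classpath. The class XmlBytes and the method toXmlBytes are hypothetical names made up for this example. The check for an existing declaration is deliberately simplistic, and the sketch assumes the charset you pass in is one that VTD-XML can read.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class XmlBytes {

    // Hypothetical helper: encodes the XML text in the given charset.
    // If the text has no XML declaration, one naming that charset is prepended.
    // If a declaration is already present, the caller is responsible for
    // passing the charset that the declaration names.
    static byte[] toXmlBytes(String xml, Charset charset) {
        String text = xml;
        // Simplistic check: assumes a declaration, if any, starts at index 0
        // and is written as "<?xml" followed by a space.
        if (!xml.startsWith("<?xml ")) {
            text = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>" + xml;
        }
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String myString = "<root>caf\u00e9</root>";

        // The same in-memory string, serialized with two different encodings.
        byte[] utf8 = toXmlBytes(myString, StandardCharsets.UTF_8);
        byte[] latin1 = toXmlBytes(myString, StandardCharsets.ISO_8859_1);

        // Decoding each array with the charset it was written in
        // recovers the original content after the declaration.
        String fromUtf8 = new String(utf8, StandardCharsets.UTF_8);
        String fromLatin1 = new String(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(fromUtf8.endsWith(myString));
        System.out.println(fromLatin1.endsWith(myString));
    }
}
```

Either byte array could then be handed to vg.setDoc(...) as in the snippet above, because each one carries a declaration that matches the way it was encoded.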