Why can't my DOM parser read UTF-8?

My DOM parser cannot load an XML file when the file contains UTF-8 characters. I know that I have to tell the parser to read the file as UTF-8, but I do not know where to put that in my code. Here it is:

    File xmlFile = new File(fileName);
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(xmlFile);
    doc.getDocumentElement().normalize();

I know that there is a setEncoding() method, but I don't know where to put it in my code...

+4
3 answers

Try this. It worked for me:

    InputStream inputStream = new FileInputStream(completeFileName);
    Reader reader = new InputStreamReader(inputStream, "UTF-8");
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(is);
+10

Try using a Reader and pass the encoding as a parameter:

    InputStream inputStream = new FileInputStream(fileName);
    documentBuilder.parse(new InputSource(new InputStreamReader(inputStream, "UTF-8")));
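
For reference, the Reader approach can be sketched as a self-contained program. This is only an illustration under stated assumptions: the class name `Utf8DomExample`, the helper `parseUtf8`, and the in-memory sample XML are hypothetical and stand in for the real file from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf8DomExample {

    // Hypothetical helper: decodes the bytes as UTF-8 before the parser sees them.
    static Document parseUtf8(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // try-with-resources closes the reader (and the underlying stream) after parsing.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            Document doc = builder.parse(new InputSource(reader));
            doc.getDocumentElement().normalize();
            return doc;
        }
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for the XML file; "\u00e9" is the letter e with an acute accent.
        String xml = "<greeting>caf\u00e9</greeting>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        Document doc = parseUtf8(in);
        // Four characters if the two-byte UTF-8 sequence was decoded as one character.
        System.out.println(doc.getDocumentElement().getTextContent().length());
    }
}
```

To read a real file, pass `new FileInputStream(fileName)` to the helper instead of the in-memory stream.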
+5

I used what Eugene did there, and changed it a bit.

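    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    FileInputStream in = new FileInputStream(new File("XML.xml"));
    // parse(InputStream, String) treats the second argument as a system ID,
    // not an encoding, so the encoding is set on an InputSource instead.
    InputSource source = new InputSource(in);
    source.setEncoding("UTF-8");
    Document doc = dBuilder.parse(source);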

Although the file will be read as UTF-8, if you print the result to the Eclipse console, the non-ASCII characters will not be displayed correctly unless the Java file is saved as UTF-8. At least, that is what happened to me.
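
One possible explanation for the console behavior is that `System.out` encodes text with the platform default charset rather than UTF-8. The following is a minimal sketch of printing through an explicit UTF-8 `PrintStream`; the class name `Utf8ConsoleExample` and the helper `utf8Stream` are hypothetical, and it assumes the console itself is configured to display UTF-8.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8ConsoleExample {

    // Hypothetical helper: wraps an OutputStream in an auto-flushing PrintStream
    // that encodes characters as UTF-8 instead of the platform default charset.
    static PrintStream utf8Stream(OutputStream out) throws UnsupportedEncodingException {
        return new PrintStream(out, true, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        PrintStream console = utf8Stream(System.out);
        // "\u00e9" is written to the console as the two-byte UTF-8 sequence C3 A9.
        console.println("caf\u00e9");
    }
}
```

Whether the text then appears correctly still depends on the encoding the console uses to interpret those bytes.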

-1

Source: https://habr.com/ru/post/1479360/

