Problem reading a text file with Java BufferedReader

Problem: Arabic words in my text files, when read by Java, show up as a series of question marks: ??????

Here is the code:

        File[] fileList = mainFolder.listFiles();
        BufferedReader bufferReader = null;
        Reader reader = null;


        try{

        for(File f : fileList){           
            reader = new InputStreamReader(new FileInputStream(f.getPath()), "UTF8");
            bufferReader = new BufferedReader(reader);
            String line = null;

            while((line = bufferReader.readLine())!= null){
               System.out.println(new String(line.getBytes(), "UTF-8"));
            }              

        }
        }
        catch(Exception exc){
            exc.printStackTrace();
        }

        finally {
            //Close the BufferedReader
            try {
                if (bufferReader != null)
                    bufferReader.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }

As you can see, I specified the UTF-8 encoding in several places and still get question marks. Do you have any ideas how I can fix this?

Thanks.

2 answers

Replace

System.out.println(new String(line.getBytes(), "UTF-8"));

by

System.out.println(line);

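Calling String#getBytes() without a charset argument encodes the string using the platform default encoding, which is not necessarily UTF-8. You are already decoding the file as UTF-8 in the InputStreamReader, so there is no need to convert the string back and forth afterwards.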
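To illustrate why the extra conversion is harmful, here is a minimal sketch. The class name `RoundTripDemo`, the helper `roundTrip`, and the sample word are made up for illustration, and ISO-8859-1 merely stands in for a platform default encoding that cannot represent Arabic; your actual default may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Encode with the given charset, then decode the bytes as UTF-8,
    // mimicking new String(line.getBytes(), "UTF-8") from the question.
    static String roundTrip(String text, Charset encodeWith) {
        return new String(text.getBytes(encodeWith), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // sample Arabic word

        // UTF-8 in, UTF-8 out: the text survives, so the round trip is pointless.
        System.out.println(roundTrip(arabic, StandardCharsets.UTF_8).equals(arabic)); // true

        // A charset without Arabic letters replaces each one with the byte for '?'.
        System.out.println(roundTrip(arabic, StandardCharsets.ISO_8859_1)); // ?????
    }
}
```

In the second case the characters are already lost at the `getBytes` step, so no later decoding can bring them back.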

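Also make sure that the console where the lines are displayed supports UTF-8. In Eclipse, for example, you can set this via Window > Preferences > General > Workspace > Text file encoding > Other > UTF-8.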
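Putting both points together, a corrected reader might look like the sketch below. The class name `Utf8Lines` and the method `readUtf8Lines` are hypothetical, file paths are taken from the command line rather than from `mainFolder`, and wrapping `System.out` in a UTF-8 `PrintStream` is only one possible way to control the output encoding; it helps only if the console itself interprets the bytes as UTF-8.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Utf8Lines {
    // Decode the stream as UTF-8 exactly once and return its lines.
    static List<String> readUtf8Lines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader (and the stream) even on failure
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // no getBytes()/new String() round trip
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Encode console output as UTF-8 instead of the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        for (String path : args) {
            try (InputStream in = new FileInputStream(path)) {
                for (String line : readUtf8Lines(in)) {
                    out.println(line);
                }
            }
        }
    }
}
```

Unlike the code in the question, this closes every file that was opened, not just the last one.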


+2

In addition to the above, I would suggest dumping the string as its individual Unicode characters. For example:

char[] chars = line.toCharArray();
for (int i = 0; i < chars.length; i++)
{
    System.out.println(i + ": " + chars[i] + " - " + (int) chars[i]);
}

Then check the printed values against the Unicode code charts.

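If you are getting 63, then the program really is reading question marks... which suggests that the file is not actually UTF-8, or that it has lost data.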

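If, on the other hand, you see "?" on screen but the value is not 63, then the data was read correctly and the problem is in how it is displayed.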
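The check for the value 63 can be automated with a small helper. This is only a sketch: the class name `QuestionMarkCheck` and the method `diagnose` are invented for illustration, and the extra test for U+FFFD (the replacement character that Java decoders substitute for malformed input) is an addition not mentioned in the answer.

```java
public class QuestionMarkCheck {
    // Classify a line that shows up as question marks on screen.
    static String diagnose(String line) {
        if (line.indexOf('\uFFFD') >= 0) {
            // The decoder hit bytes that are not valid in the charset it was given.
            return "replacement character in data";
        }
        if (line.indexOf('?') >= 0) {
            // Code point 63: the question marks are really in the data.
            return "literal question mark in data";
        }
        // Nothing suspicious in the string itself, so suspect the display side.
        return "no question mark in data";
    }

    public static void main(String[] args) {
        System.out.println(diagnose("\u0645\u0631\u062D\u0628\u0627"));
        System.out.println(diagnose("?????"));
    }
}
```

A line that legitimately contains a question mark would also be reported, so treat the result as a hint rather than proof.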

+3

Source: https://habr.com/ru/post/1781758/

