Convert String to Android JSONObject loses utf-8

I am trying to fetch a JSON-formatted string from a URL and use it as a JSON object. The UTF-8 text is lost when I convert the String to a JSONObject.

This is the function I use to connect to the URL and get the string:

private static String getUrlContents(String theUrl) {
    StringBuilder content = new StringBuilder();
    try {
        URL url = new URL(theUrl);
        URLConnection urlConnection = url.openConnection();
        BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));

        String line;
        while ((line = bufferedReader.readLine()) != null) {
            content.append(line + "\n");
        }
        bufferedReader.close();
    } catch(Exception e) {
        e.printStackTrace();
    }

    return content.toString();
}

When I receive data from the server, the following code displays the correct characters:

String output = getUrlContents(url);
Log.i("message1", output);

But when I convert the output string to JSONObject, Persian characters become question marks like this ??????. (messages is the name of the array in JSON)

JSONObject reader = new JSONObject(output);
String messages = new String(reader.getString("messages").getBytes("ISO-8859-1"), "UTF-8");
Log.i("message2", messages);
4 answers

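You are telling Java to convert the string (the value under the key messages) to bytes using ISO-8859-1, and then to build a new String from those bytes, interpreted as UTF-8. Persian characters cannot be represented in ISO-8859-1, so each one is replaced with a question mark in the first step.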

new String(reader.getString("messages").getBytes("ISO-8859-1"), "UTF-8");

The string returned by getString is already decoded correctly, so you can simply use:

String messages = reader.getString("messages");
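To see why the extra conversion produces question marks, here is a minimal sketch that runs on a plain JVM without org.json. The class name RoundTripDemo and the method roundTrip are made up for illustration; the method just repeats the getBytes("ISO-8859-1") / new String(..., "UTF-8") step from the question on a string that is already decoded.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Repeats the conversion from the question on an already decoded String.
    static String roundTrip(String decoded) {
        // Characters that ISO-8859-1 cannot represent are replaced with '?' here.
        byte[] latin1 = decoded.getBytes(StandardCharsets.ISO_8859_1);
        // The '?' bytes are plain ASCII, so decoding them as UTF-8 keeps them as '?'.
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String persian = "\u0633\u0644\u0627\u0645"; // four Persian letters
        System.out.println(roundTrip(persian)); // prints "????"
        System.out.println(roundTrip("hello")); // prints "hello", ASCII survives
    }
}
```

The original characters are gone after getBytes, so no later step can bring them back. That is why the fix is to drop the conversion rather than to change the charset names in it.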

Specify the charset explicitly when you create the InputStreamReader:

    private static String getUrlContents(String theUrl) {
        StringBuilder content = new StringBuilder();
        try {
            URL url = new URL(theUrl);
            URLConnection urlConnection = url.openConnection();
            BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream(), "utf-8"));

            String line;
            while ((line = bufferedReader.readLine()) != null) {
                content.append(line).append("\n");
            }
            bufferedReader.close();
        } catch(Exception e) {
            e.printStackTrace();
        }

        return content.toString().trim();
    }
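As a rough illustration of what the charset argument changes, the sketch below decodes the same UTF-8 bytes twice, once as UTF-8 and once as ISO-8859-1. It uses a ByteArrayInputStream in place of the network stream; ReaderCharsetDemo and decode are hypothetical names, not part of the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetDemo {
    // Decodes a response body with the given charset, the way InputStreamReader would.
    static String decode(byte[] body, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(body), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u0633\u0644\u0627\u0645"; // four Persian letters
        byte[] body = original.getBytes(StandardCharsets.UTF_8); // what a UTF-8 server would send

        // Matching charset: the text comes back unchanged.
        System.out.println(decode(body, StandardCharsets.UTF_8).equals(original)); // prints "true"
        // Wrong charset: every byte becomes its own character, so 4 letters turn into 8.
        System.out.println(decode(body, StandardCharsets.ISO_8859_1).length()); // prints "8"
    }
}
```

When no charset is passed, InputStreamReader uses the platform default, so whether the question's original code works depends on the device or JVM it runs on.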

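There are two problems:

  • You do not specify an encoding when converting bytes to characters. If you create an InputStreamReader without a charset, it uses the platform default, which may not match what the server sends. The correct encoding is normally given by the charset parameter of the HTTP Content-Type header. For JSON, if no charset is given, the encoding should be UTF-8, UTF-16 or UTF-32, and it can be detected from the first bytes of the body. In practice, defaulting to UTF-8 is usually safe.

  • The line String messages = new String(reader.getString("messages").getBytes("ISO-8859-1"), "UTF-8"); is wrong. It corrupts every character that is not plain ASCII, because it encodes the string as ISO-8859-1 and then decodes those bytes as UTF-8.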

A simple regex pattern can be used to extract the charset value from the Content-Type header before reading the input stream. I also included a neat InputStream -> String converter.

private static String getUrlContents(String theUrl) {

    try {
        URL url = new URL(theUrl);
        URLConnection urlConnection = url.openConnection();
        InputStream is = urlConnection.getInputStream();

        // Get charset field from Content-Type header
        String contentType = urlConnection.getContentType();
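        // matches value in key / value pair
        Pattern encodingPattern = Pattern.compile(".*charset\\s*=\\s*([\\w-]+).*");
        // getContentType() returns null when the header is missing, so guard before matching
        Matcher encodingMatcher = encodingPattern.matcher(contentType == null ? "" : contentType);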
        // set charsetString to match value if charset is given, else default to UTF-8
        String charsetString = encodingMatcher.matches() ? encodingMatcher.group(1) : "UTF-8";

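        // Quick way to read the whole InputStream:
        // \A matches the beginning of the input, so the entire stream becomes one token.
        // hasNext() guards against an empty body, where next() would throw.
        Scanner scanner = new Scanner(is, charsetString).useDelimiter("\\A");
        return scanner.hasNext() ? scanner.next() : "";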
    } catch(Exception e) {
        e.printStackTrace();
    }

    return null;
}
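The header parsing can be pulled out into a small helper so it can be tried without a network connection. This is only a sketch: ContentTypeCharset and charsetFromContentType are hypothetical names, the pattern is a slightly relaxed variant of the one above (case-insensitive, optional quotes around the value), and falling back to UTF-8 is an assumption that suits JSON responses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContentTypeCharset {
    // Matches charset=<value>, with optional whitespace and optional double quotes.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([\\w-]+)\"?", Pattern.CASE_INSENSITIVE);

    // Returns the charset named in a Content-Type header value, or "UTF-8" if none is given.
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return "UTF-8"; // header missing: assume UTF-8
        }
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : "UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(charsetFromContentType("application/json; charset=ISO-8859-1")); // prints "ISO-8859-1"
        System.out.println(charsetFromContentType("application/json")); // prints "UTF-8"
        System.out.println(charsetFromContentType(null)); // prints "UTF-8"
    }
}
```

The returned name can be passed to new InputStreamReader(stream, name) or new Scanner(stream, name). Both reject a charset name the JVM does not support, so a strange header value will surface as an exception rather than as silently wrong text.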

Not sure if this will help, but you can do something like this:

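String messages = null;
try
{
    // Assumes output holds the raw response bytes (byte[]), not an already decoded String;
    // there is no String(String, String) constructor
    String str = new String(output, "UTF-8");
    JSONObject result = (JSONObject) new JSONTokener(str).nextValue();
    messages = result.getString("messages");
}
catch (Exception e) {
    // messages stays null if decoding or parsing fails
    e.printStackTrace();
}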

Source: https://habr.com/ru/post/1623490/

