I had a similar problem with Hebrew text. I found out that this was caused by the default encoding.
To check the default encoding, I used this code:
OutputStreamWriter out = new OutputStreamWriter(new ByteArrayOutputStream());
String encoding = out.getEncoding();
On my computer, the default encoding is "UTF8". On the GAE server, it is "ASCII", which cannot represent Hebrew characters.
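For anyone who wants to run the same check, here is a minimal, self-contained sketch. The class name DefaultEncodingCheck and the helper writerEncoding are just names I made up for illustration; the second line printed uses Charset.defaultCharset(), which should report the same default under its canonical name.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

// Hypothetical class name, used only for this illustration.
public class DefaultEncodingCheck {

    // Returns the encoding an OutputStreamWriter picks when none is specified,
    // i.e. the platform default (reported under its historical name, e.g. "UTF8").
    static String writerEncoding() {
        OutputStreamWriter out = new OutputStreamWriter(new ByteArrayOutputStream());
        return out.getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("Writer encoding: " + writerEncoding());
        // Canonical name of the same default, e.g. "UTF-8" or "US-ASCII".
        System.out.println("Default charset: " + Charset.defaultCharset().name());
    }
}
```

Running this both locally and on the server should show whether the two environments disagree.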
I solved the problem by replacing all the file readers in my code with readers that name the encoding explicitly:
new InputStreamReader(new FileInputStream(file), "UTF8");
This tells Java to ignore the default encoding and open every input file as UTF-8.
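A fuller sketch of the same idea, in case it helps. The class Utf8FileReader and the method readUtf8 are hypothetical names, not something from my original code, and I use StandardCharsets.UTF_8 (Java 7+) instead of the string "UTF8" so that a typo in the name is caught at compile time.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical class name, used only for this illustration.
public class Utf8FileReader {

    // Reads the whole file as UTF-8, regardless of the platform default encoding.
    static String readUtf8(File file) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a short Hebrew word as UTF-8 bytes to a temp file, then read it back.
        String hebrew = "\u05e9\u05dc\u05d5\u05dd";
        File tmp = File.createTempFile("hebrew", ".txt");
        try {
            Files.write(tmp.toPath(), hebrew.getBytes(StandardCharsets.UTF_8));
            System.out.println("Round trip matches: " + hebrew.equals(readUtf8(tmp)));
        } finally {
            tmp.delete();
        }
    }
}
```

The same applies when writing: pass the charset to OutputStreamWriter as well, otherwise output falls back to the default encoding.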