Java file name encoding

I am running a small Java application on an embedded Linux platform. After replacing the JamVM Java VM with OpenJDK, file names containing special characters are no longer saved correctly. Special characters, such as umlauts, are replaced by question marks.

Here is my test code:

    import java.io.File;
    import java.io.IOException;

    public class FilenameEncoding {
        public static void main(String[] args) {
            String name = "umlaute-äöü";
            System.out.println("\nname = " + name);
            System.out.print("name in Bytes: ");
            for (byte b : name.getBytes()) {
                System.out.print(Integer.toHexString(b & 255) + " ");
            }
            System.out.println();
            try {
                File f = new File(name);
                f.createNewFile();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

Running it produces the following output:

    name = umlaute-???
    name in Bytes: 75 6d 6c 61 75 74 65 2d 3f 3f 3f

and a file named umlaute-??? is created.

Setting the file.encoding and sun.jnu.encoding properties to UTF-8 gives the correct strings in the terminal, but the created file is still named umlaute-???.

Running the VM under strace, I see the system call

 open("umlaute-???", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 4 

This shows that the problem lies not in the file system but in the VM.

How can I set the encoding used for file names?

+6
3 answers

If you use Eclipse, you can go to Window -> Preferences -> General -> Workspace and select the "Text file encoding" option from the drop-down menu. By changing mine, I was able to reproduce your problem (and also to switch back to the fix).

If you do not use Eclipse, you can add an environment variable in Windows (System Properties → Environment Variables, then select New... under the system variables). The name should be (without quotation marks) JAVA_TOOL_OPTIONS, and the value should be set to -Dfile.encoding=UTF8 (or whichever other encoding works for you).

I found the answer through this post, by the way: Setting default Java character encoding?

Linux Solutions

- (Permanent) Running env | grep LANG in the terminal gives you one or two lines showing the locale Linux is currently configured with. You can then set LANG to UTF8 (yours may be set to ASCII) in the /etc/sysconfig/i18n file (I tested this on Fedora with kernel 2.6.40). Basically, I switched from UTF8 (where I got odd characters) to ASCII (where I got question marks) and back.

- (At JVM startup, but this may not fix the problem) You can start the JVM with the encoding you want to use: java -Dfile.encoding=**** FilenameEncoding. Here is the output with and without the option:

    [youssef@JoeLaptop bin]$ java -Dfile.encoding=UTF8 FilenameEncoding
    name = umlaute-הצ
    name in Bytes: 75 6d 6c 61 75 74 65 2d d7 94 d7 a6 ef bf bd
    UTF-8
    UTF8
    [youssef@JoeLaptop bin]$ java FilenameEncoding
    name = umlaute-???????
    name in Bytes: 75 6d 6c 61 75 74 65 2d 3f 3f 3f 3f 3f 3f 3f
    US-ASCII
    ASCII
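
If it helps, here is a minimal sketch for checking which encodings the JVM actually picked up from the environment and the command line. The class name EncodingReport and the describe helper are made up for illustration, and sun.jnu.encoding is an internal, undocumented property, so whether your VM sets it at all is an assumption.

```java
import java.nio.charset.Charset;

public class EncodingReport {
    // Returns "key = value" for a system property, or "key = (not set)" if absent.
    static String describe(String key) {
        String value = System.getProperty(key);
        return key + " = " + (value == null ? "(not set)" : value);
    }

    public static void main(String[] args) {
        // Locale variables the JVM reads at startup (null if not exported).
        System.out.println("LANG = " + System.getenv("LANG"));
        System.out.println("LC_ALL = " + System.getenv("LC_ALL"));
        // Properties discussed in this thread.
        System.out.println(describe("file.encoding"));
        System.out.println(describe("sun.jnu.encoding"));
        // Charset used by String.getBytes() when no charset is passed.
        System.out.println("default charset = " + Charset.defaultCharset().name());
    }
}
```

Running it once plain and once with -Dfile.encoding=UTF8 should show which of the values the option actually changes on your platform.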

Here is a link on setting environment variables in Linux: http://www.cyberciti.biz/faq/set-environment-variable-linux/

and here is one about -Dfile.encoding: Setting the default Java character encoding?

+3

Your problem is that javac expects a different encoding for your .java file than the one you saved it in. Did javac warn you when compiling?

You might have saved it as ISO-8859-1 or windows-1252 while javac is expecting UTF-8.

Specify the correct javac encoding using the -encoding flag or equivalent for your build tool.
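
One way to rule the source encoding in or out, as a sketch: write the umlauts as Unicode escapes so the source file is pure ASCII and compiles the same under any -encoding setting. The class name EscapedName and the toHex helper are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class EscapedName {
    // a-umlaut, o-umlaut and u-umlaut written as Unicode escapes, so the
    // compiled string does not depend on the encoding javac assumes.
    static final String NAME = "umlaute-\u00e4\u00f6\u00fc";

    // Formats bytes as space-separated lowercase hex, like the question's output.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Explicit charset: independent of file.encoding and the locale.
        System.out.println("UTF-8 bytes:   " + toHex(NAME.getBytes(StandardCharsets.UTF_8)));
        // Default charset: this is what the question's getBytes() call uses.
        System.out.println("default bytes: " + toHex(NAME.getBytes()));
    }
}
```

If the UTF-8 line ends in c3 a4 c3 b6 c3 bc but the default line ends in 3f 3f 3f, the source file is fine and it is the runtime default charset that replaces the characters.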

0

I know this is an old question, but I had the same problem. None of the solutions mentioned worked for me, but the following did:

  • Setting the source encoding to UTF-8 (project.build.sourceEncoding set to UTF-8 in the Maven properties)
  • JVM arguments: -Dfile.encoding=utf8 and -Dsun.jnu.encoding=utf8
  • Using java.nio.file.Path instead of java.io.File
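
A minimal sketch of the last point, assuming a writable temp directory; PathCreate and create are hypothetical names. As far as I know, on Linux the NIO API throws InvalidPathException when a name cannot be encoded in the platform's file name encoding, whereas java.io.File silently substitutes question marks, so at least the failure becomes visible.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

public class PathCreate {
    // Creates an empty file called `name` inside `dir`.
    // Returns false (instead of writing a mangled name) when the name
    // cannot be converted to a path on this platform.
    static boolean create(Path dir, String name) {
        try {
            Path file = dir.resolve(name);
            Files.createFile(file);
            return Files.exists(file);
        } catch (InvalidPathException e) {
            System.err.println("cannot encode file name: " + e.getMessage());
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("encoding-test");
        // Umlauts as Unicode escapes, so the source encoding plays no role.
        System.out.println("created: " + create(dir, "umlaute-\u00e4\u00f6\u00fc"));
        System.out.println("directory: " + dir);
    }
}
```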
0

Source: https://habr.com/ru/post/912932/

