I have a file whose name contains Unicode characters on a Linux server. If I SSH into the server and use tab completion to navigate to a file or folder containing Unicode characters, I have no problem accessing it. The problem occurs when I try to access the file through PHP (the function I was calling on the file system was stat). If I output the path generated by the PHP script to the browser and paste it into the terminal, the file also seems not to exist (even though, looking at the terminal, the file paths appear exactly the same).
I have set PHP to use UTF-8 as its default encoding through php.ini, and I have also set mb_internal_encoding. I checked the encoding of the file path string in PHP and it comes out as UTF-8, as you would expect. After thinking about it a little more, I decided to hexdump the é character that the terminal tab-completes and compare it with the hexdump of a "regular" é character created by the PHP script, or typed manually on the keyboard (Option+e, then e, on OS X). Here is the result:
echo -n é | hexdump
0000000 cc65 0081
0000003
echo -n é | hexdump
0000000 a9c3
0000002
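To read the two dumps byte by byte: without options, hexdump prints 16-bit words in host byte order, so on a little-endian machine each pair appears swapped (cc65 is really the bytes 65 cc). Below is a minimal sketch of a byte-wise view, assuming a POSIX shell with od and tr available. The helper name hex_bytes is hypothetical, and the octal escapes are a reconstruction of the two inputs from the dumps above, not something taken from the PHP script.

```shell
# Print the bytes read from stdin as one contiguous hex string, in file order.
# (hex_bytes is a hypothetical helper name used only for this illustration.)
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# "e" followed by U+0301 COMBINING ACUTE ACCENT: bytes 65 cc 81
decomposed=$(printf 'e\314\201' | hex_bytes)

# U+00E9 LATIN SMALL LETTER E WITH ACUTE as a single code point: bytes c3 a9
precomposed=$(printf '\303\251' | hex_bytes)

echo "decomposed:  $decomposed"
echo "precomposed: $precomposed"
```

Read this way, the first dump is an e followed by a combining acute accent (3 bytes) and the second is the single precomposed é (2 bytes). Both are valid UTF-8, which would explain why the encoding check reports UTF-8 in either case.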
The é character that correctly refers to the file in the terminal is 3 bytes long, while the one produced by PHP or typed on the keyboard is only 2 bytes. I'm not sure where to go from here: what encoding should I use in PHP? Should I convert the path to another encoding with iconv or mb_convert_encoding?