I have the following gettext .po file, translated from a .pot file. I am working on a Linux system (openSUSE, if that matters), running gettext 0.17.
#
# <translate@transme.de>, 2011
# transer <translate@transme.de>, 2011
msgid ""
msgstr ""
"Project-Id-Version: transtest\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2011-05-24 22:47+0100\n"
"PO-Revision-Date: 2011-05-30 23:03+0100\n"
"Last-Translator: \n"
"Language-Team: German (Germany)\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Language: de_DE\n"
"Plural-Forms: nplurals=2; plural=(n != 1)\n"

#: transtest.cpp:12
msgid "Min Size"
msgstr "Min Größe"
I create the .mo file with
msgfmt -c transtest_de_DE.po -o transtest.mo
and check the encoding of the .po file with the file command:
file --mime transtest_de_DE.po
transtest_de_DE.po: text/x-po; charset=utf-8
I then install the .mo file in my locale folder and run the program after exporting LANG and LC_CTYPE, but I get garbage in place of the two non-ASCII characters.
If I set my terminal encoding to ISO-8859-2 instead of UTF-8, I see the two characters correctly.
Opening the generated .mo file in a text editor shows that it is also in UTF-8 (I can see the characters when I set the editor's encoding to UTF-8).
The program is very simple, and it looks like this:
#include <clocale>    // setlocale
#include <iostream>
#include <libintl.h>  // gettext, bindtextdomain, textdomain

const char *PROGRAM_NAME = "transtest";

using namespace std;

int main()
{
    setlocale(LC_ALL, "");  // take the locale from the environment
    bindtextdomain(PROGRAM_NAME, "/usr/share/locale");
    textdomain(PROGRAM_NAME);
    cerr << gettext("Min Size") << endl;
}
I installed the .mo file as /usr/share/locale/de_DE/LC_MESSAGES/transtest.mo and exported LC_CTYPE and LANG as "de_DE":
$ echo $LC_CTYPE; echo $LANG
de_DE
de_DE
Where am I going wrong? Why does gettext give me the wrong encoding (ISO-8859-2) for my strings instead of the UTF-8 requested in the .po file?
Edit:
The solution was in a Stack Overflow question about getting (UTF-8) traditional Chinese characters to work in the PHP gettext extension (.po and .mo files created in poEdit). It seems that I need to explicitly call
bind_textdomain_codeset(PROGRAM_NAME, "utf-8");
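The final program looks like this:
#include <clocale>    // setlocale
#include <iostream>
#include <libintl.h>  // gettext, bindtextdomain, bind_textdomain_codeset, textdomain

const char *PROGRAM_NAME = "transtest";

using namespace std;

int main()
{
    setlocale(LC_ALL, "");  // take the locale from the environment
    bindtextdomain(PROGRAM_NAME, "/usr/share/locale");
    // Ask gettext to return translations for this domain in UTF-8,
    // regardless of the locale's own character set.
    bind_textdomain_codeset(PROGRAM_NAME, "utf-8");
    textdomain(PROGRAM_NAME);
    cerr << gettext("Min Size") << endl;
}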
No changes to any of my gettext files are required.