UTF-8 is the only practical encoding that can handle all of these alphabets. It is also the default encoding for XML and the only encoding that makes sense for a modern application. (That applies to storage and transmission; for internal processing, your language's string type will most likely be UTF-16 or UTF-32.)
It sounds like the problem comes from an error in the input file rather than from the choice of output encoding. Perhaps the file was encoded in something other than UTF-8 and whoever produced it forgot to include the <?xml encoding?> declaration. Or maybe it contains an invalid ISO-2022-JP escape sequence? (That encoding is a horror.)
You should try loading the input file into something that parses XML (like Firefox or IE) and see what errors, if any, it reports.
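If a browser is not handy, the same check can be scripted. Below is a minimal sketch using Python's standard xml.etree.ElementTree parser; the helper name check_xml is made up for illustration, and it assumes you have the file contents as raw bytes. It returns the parser's error message (which includes the line and column) or None if the document is well-formed.

```python
import xml.etree.ElementTree as ET


def check_xml(data: bytes):
    """Return None if `data` parses as XML, otherwise the parser's error text.

    `check_xml` is a hypothetical helper name, not part of any library.
    Passing bytes (not str) lets the parser apply the XML rules itself:
    no encoding declaration means UTF-8 is assumed.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # str(err) includes the line and column of the first problem.
        return str(err)
    return None


# Latin-1 bytes with no encoding declaration: not valid UTF-8, so this fails.
print(check_xml(b"<a>caf\xe9</a>"))

# The same bytes with a matching declaration parse without error.
print(check_xml(b'<?xml version="1.0" encoding="ISO-8859-1"?><a>caf\xe9</a>'))
```

An error on the first call but not the second is the typical signature of a file that is not UTF-8 and is missing its declaration.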
(You cannot mix encodings in one XML file. If you are writing strings from different sources into the XML without first converting them to a single encoding, you have already lost. How is this XML being generated?)
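For illustration only, since I do not know how your XML is actually produced: one way to avoid mixed encodings is to decode every incoming byte string to a Unicode string using its source's known encoding, and only then serialize the whole document once as UTF-8. The function name build_xml, the element names, and the (bytes, encoding) input shape are all assumptions made for this sketch. It needs Python 3.8 or later for the xml_declaration argument.

```python
import xml.etree.ElementTree as ET


def build_xml(items):
    """Build one UTF-8 XML document from (raw_bytes, source_encoding) pairs.

    The input shape and the <items>/<item> element names are hypothetical.
    """
    root = ET.Element("items")
    for raw, encoding in items:
        child = ET.SubElement(root, "item")
        # Decode at the boundary, so only Unicode text enters the tree.
        child.text = raw.decode(encoding)
    # Serialize exactly once, in a single encoding, with a declaration.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


sources = [
    ("caf\u00e9".encode("latin-1"), "latin-1"),
    ("\u65e5\u672c\u8a9e".encode("iso2022_jp"), "iso2022_jp"),
]
document = build_xml(sources)
print(document.decode("utf-8"))
```

The point of the design is that the encoding of each source matters only at the moment it is decoded; after that, everything in the tree is plain Unicode and the output encoding is a single decision made at the end.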