Why does QCoreApplication call `setlocale(LC_ALL, "")` by default on Unix / Linux?

I think it is safe to say that C locales are universally recognized as a bad idea.

Writing an application that parses or writes machine-readable text formats (which happens quite often) with the standard C library functions becomes almost impossible if you have to allow for the locale being set to anything other than "C". Since the locale is usually process-wide (and setlocale is often not thread-safe), if you are writing a library or a multithreaded program, it is not even safe to call setlocale(LC_ALL, "C") and restore the previous locale after doing your work.
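To make the pattern concrete, here is a minimal sketch of the "switch to C, parse, switch back" approach described above. The function name parse_double_global_switch is hypothetical and only serves the illustration; the comments mark where the approach breaks down in a library or a multithreaded program.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Parse a machine-format number by temporarily forcing LC_NUMERIC to "C".
 * NOT thread-safe: setlocale() changes process-wide state, so any other
 * thread that formats or parses numbers between the two setlocale() calls
 * sees the "C" locale as well. Shown only to illustrate the problem. */
double parse_double_global_switch(const char *text)
{
    /* The string returned by setlocale(..., NULL) may be overwritten by a
     * later setlocale() call, so keep a private copy of it. */
    const char *current = setlocale(LC_NUMERIC, NULL);
    char *saved = NULL;
    if (current != NULL) {
        size_t len = strlen(current) + 1;
        saved = malloc(len);
        if (saved != NULL)
            memcpy(saved, current, len);
    }

    setlocale(LC_NUMERIC, "C");        /* '.' is now the decimal separator */
    double value = strtod(text, NULL);

    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);  /* restore whatever was set before */
        free(saved);
    }
    return value;
}
```

In a single-threaded program, parse_double_global_switch("1.5") yields 1.5 whatever the user's locale is; the window between the two setlocale calls is what makes it unsafe elsewhere.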

For these reasons, the rule is usually "avoid setlocale, period". However, we have been bitten several times in the past by the peculiar behavior of QCoreApplication and its derived classes; the documentation says:

On Unix / Linux, Qt is configured to use the system default locale settings. This can lead to a conflict when using POSIX functions, for example, when converting between data types such as floats and strings, since the notation can differ between locales. To work around this problem, call the POSIX function setlocale(LC_NUMERIC, "C") immediately after initializing QApplication or QCoreApplication to reset the locale used for number formatting to the "C" locale.
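As a rough plain-C sketch of that workaround: the first setlocale call below merely stands in for what the documentation says the application constructor does, and it is not Qt code. The function name reset_numeric_after_init is hypothetical.

```c
#include <locale.h>
#include <string.h>

/* Returns 0 if LC_NUMERIC ends up as "C", -1 otherwise. */
int reset_numeric_after_init(void)
{
    /* Stand-in for the application object's initialization: adopt the
     * user's locale for every category. The call may return NULL if the
     * environment names a locale that is not installed; ignored here. */
    setlocale(LC_ALL, "");

    /* The documented workaround: put number formatting back to "C",
     * leaving the other categories (LC_CTYPE, LC_MESSAGES, ...) alone. */
    const char *numeric = setlocale(LC_NUMERIC, "C");
    if (numeric == NULL || strcmp(numeric, "C") != 0)
        return -1;
    return 0;
}
```

After this, printf and strtod use '.' as the decimal separator again, while character classification and messages still follow the user's settings.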

This behavior is described in another question; my question is: what could be the rationale for this apparently foolish behavior? In particular, what is so peculiar about Unix and Linux that prompted such a decision only on these platforms?

(By the way, will everything break if I just call setlocale(LC_ALL, "C"); after creating the QApplication? If that is fine, why don't they just remove their setlocale(LC_ALL, ""); call?)

+6
3 answers

From digging through the Qt source code by @Phil Armstrong and me (see the chat log), it seems that the setlocale call has been there since version 1, for several reasons:

  • XIM, at least in ancient times, could not correctly "get" the current locale without such a call.
  • On Solaris, it even crashed with the default C locale.
  • On Unix systems, it was used (among other mechanisms, in a complicated fallback scheme) to "sniff" the "system character set" (whatever that means on Unix) and thus be able to convert between the QString representation and the "local" 8-bit encoding (this is particularly important for file paths).

It is true that it already checks the LC_* environment variables, as it does for QLocale, but I suppose it may be useful to have nl_langinfo decode the current LC_CTYPE if the application explicitly changed it (but to tell whether there was an explicit change, it has to start from the system defaults).
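For reference, querying the character set through nl_langinfo looks roughly like this in plain C. This is a sketch of the POSIX mechanism, not Qt's actual detection code, and the function name current_codeset is hypothetical.

```c
#include <langinfo.h>
#include <locale.h>

/* Return the name of the character encoding of the current LC_CTYPE
 * locale, e.g. "UTF-8". The result points into static storage and may be
 * overwritten by later nl_langinfo() or setlocale() calls. */
const char *current_codeset(void)
{
    /* Without this call the program is still in the startup "C" locale,
     * and nl_langinfo() reports the C locale's encoding rather than the
     * one the user configured through LANG / LC_ALL / LC_CTYPE. */
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

This is why the setlocale call matters for the encoding "sniffing": nl_langinfo only reflects the environment once the program has adopted the user's locale.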

Interestingly, they used to call setlocale(LC_NUMERIC, "C") immediately after setlocale(LC_ALL, ""), but this was removed in Qt 4.4. The rationale for that decision seems to lie in task #132859 of the old Qt bug tracker (which moved between TrollTech, Nokia and QtSoftware.com before vanishing without a trace, even from the Wayback Machine), which is referenced by two bugs on this topic. I think an authoritative answer on the subject was there, but I cannot find a way to recover it.

My guess is that it introduced subtle bugs: the environment looked untouched, but it had in fact been changed by the setlocale call in every category except LC_NUMERIC (which is the most visible one). They probably removed the call to make the locale setup more obvious and to get application developers to act accordingly.

+6

Qt calls setlocale(LC_ALL, "") because it is the right thing to do: every standard Unix program, from cat on up, calls setlocale(LC_ALL, ""). The consequence of this call is that the program's locale is set to the one specified by the user. See the setlocale() man page:

When the main program starts, the portable "C" locale is selected by default. A program can be made portable to all locales by calling:

setlocale(LC_ALL, "");

after initializing the program ...

Given that Qt generates text to be read by the user and parses input produced by the user, it would be very unfriendly to refuse to communicate with the user in their own language. Hence the call to setlocale().

I hope that much is uncontroversial! The problem, of course, arises when you try to parse data files created by your program running under a different locale. Obviously, if you use an ad hoc text format with a parser based on sscanf and friends, rather than a specified data format with a "real" parser, then this is a recipe for data corruption if it is done without taking the locale settings into account. The solution is either a) to use a real serialization library that handles this for you, or b) to set the locale to something specific (possibly "C") when writing and reading data.

If thread safety is a concern, then on modern POSIX implementations (or any Linux system with GNU libc version >= 2.3, which is pretty much "all of them" at this point) you can call uselocale() to set a thread-local locale for all your I/O operations. Alternatively, you can call the _l versions of the regular functions, which take a locale object as an additional argument.
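A minimal sketch of the uselocale() approach, assuming a POSIX.1-2008 system where newlocale, uselocale and freelocale are declared in <locale.h> (some platforms, such as macOS, put them in <xlocale.h> instead). The function name parse_double_c_locale is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for newlocale/uselocale/freelocale */
#include <locale.h>
#include <math.h>
#include <stdlib.h>

/* Parse a machine-format number with the "C" locale active for the
 * calling thread only. The process-wide locale, and therefore every other
 * thread, is left untouched. Returns NAN if the locale object cannot be
 * created. */
double parse_double_c_locale(const char *text)
{
    locale_t c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0)
        return NAN;

    /* uselocale() installs the locale for this thread and returns the one
     * that was in effect before (possibly LC_GLOBAL_LOCALE). */
    locale_t previous = uselocale(c_locale);
    double value = strtod(text, NULL);
    uselocale(previous);

    freelocale(c_locale);
    return value;
}
```

Creating the locale object once and reusing it would avoid the newlocale/freelocale cost on every call. Where the _l variants are available (for example strtod_l, an extension on glibc and some BSDs), the same locale_t object can be passed directly instead of switching the thread's locale.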

Will everything break if you call setlocale(LC_ALL, "C");? No, but the right thing to do is to let the user set the locale they prefer, and either save your data in a well-defined format or specify the locale in which your data should be read and written at run time.

+3

What is peculiar to POSIX systems (including the Unix / Linux systems you mention) is that the OS interface and the C interface are intertwined. In particular, the C setlocale call interferes with the OS.

On Windows, by comparison, the locale is explicitly a per-thread property (SetThreadLocale), but more importantly, functions like GetNumberFormat accept a locale parameter.

Note that your problem is quite easily solved: when using Qt, use Qt. That means reading your text input into a QString, processing it, and then writing it out.

+2

Source: https://habr.com/ru/post/974814/

