StrLComp vs AnsiStrLComp when called using Unicode strings

Question

I am a bit confused about the "Ansi" and "regular" RTL string functions when they are called with Unicode strings. I understand that in older versions of Delphi (when AnsiString was the default), the "Ansi" versions were the ones that handled multibyte characters. Does this distinction mean anything when dealing with Unicode strings? Assuming I need to handle Korean characters, and that my code does not need to be compatible with older versions of Delphi, which RTL functions should I use?

+4

unicode delphi string-comparison delphi-xe

Markf Feb 29 '12 at 18:39

3 answers

Not sure what exactly you want to do, but ...

If you want to compare two strings according to the current user's locale rules, use AnsiStrLComp for a case-sensitive comparison or AnsiStrLIComp for a case-insensitive comparison. Internally, these functions call the Windows CompareString function with LOCALE_USER_DEFAULT.
If you want to compare two strings using Delphi's internal comparison mechanism, use StrLComp for a case-sensitive comparison or StrLIComp for a case-insensitive one.

So, if you compare the same two strings with AnsiStrLComp or AnsiStrLIComp on machines with different user locale settings, you can get different results; on the other hand, you get the sort order that is natural for the user's language settings in your application. StrLComp and StrLIComp behave the same on every machine, regardless of locale settings.
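A minimal sketch of the two calls side by side, assuming a Unicode Delphi (2009 or later) on Windows. The program name and the Korean sample strings are made up for illustration; they are written as character codes so the source file encoding does not matter.

```delphi
program StrLCompDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils; // System.SysUtils when unit scope names are in use (XE2 and later)

var
  A, B: string;
begin
  // Hypothetical sample data: two Korean words sharing their first two characters.
  A := #$D55C#$AD6D#$C5B4; // U+D55C U+AD6D U+C5B4
  B := #$D55C#$AD6D#$C778; // U+D55C U+AD6D U+C778

  // MaxLen counts Char elements (UTF-16 code units), not bytes.
  // The first two characters are identical, so both calls report equality (0).
  Writeln('StrLComp,     2 chars: ', StrLComp(PChar(A), PChar(B), 2));
  Writeln('AnsiStrLComp, 2 chars: ', AnsiStrLComp(PChar(A), PChar(B), 2));

  // With three characters the strings differ. StrLComp orders them by
  // character code; AnsiStrLComp orders them by the user locale's rules,
  // so its sign may vary between machines.
  Writeln('StrLComp,     3 chars: ', StrLComp(PChar(A), PChar(B), 3));
  Writeln('AnsiStrLComp, 3 chars: ', AnsiStrLComp(PChar(A), PChar(B), 3));
end.
```

Only the sign of each result (negative, zero, positive) should be relied on, not the exact value.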

+4

TLama Feb 29 '12 at 20:29

The simple answer is that, when it comes to the Delphi string routines, you should use the Ansi...() functions for Unicode strings.

However, if you are comparing strings, you may also need to normalize them first, depending on the nature, needs, and source of the strings in your application, in order to deal with Unicode equivalence.

+2

Deltics Feb 29 '12 at 21:55

Marjan venema · Accepted Answer · 2012-02-29T20:32:14+0000

The 'Ansi' prefix on the string comparison functions never meant anything except that the locale is taken into account when comparing strings, instead of doing "just" a simple binary comparison. In the Unicode world, this is still the case. The Ansi* family of functions accepts (Unicode) strings as parameters and takes the locale into account when comparing.

From the AnsiCompareStr documentation (D2009):

Most locales consider lowercase characters to be less than the corresponding uppercase characters. This is in contrast to ASCII order, in which lowercase characters are greater than uppercase characters. So setting S1 to “a” and S2 to “A” causes AnsiCompareStr to return a value less than zero, while CompareStr, with the same arguments, returns a value greater than zero.
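The documented behaviour can be checked with a small console program. This is only a sketch (the program name is made up), and the AnsiCompareStr result depends on the user locale of the machine it runs on.

```delphi
program CompareCaseDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils; // System.SysUtils when unit scope names are in use (XE2 and later)

begin
  // CompareStr compares character codes: 'a' (97) is greater than 'A' (65),
  // so the result is positive on every machine.
  Writeln('CompareStr(''a'', ''A''):     ', CompareStr('a', 'A'));

  // AnsiCompareStr follows the user locale; per the documentation quoted
  // above, most locales sort 'a' before 'A', giving a negative result.
  Writeln('AnsiCompareStr(''a'', ''A''): ', AnsiCompareStr('a', 'A'));
end.
```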

What "taking the locale into account" actually does can vary per language. It may or may not involve sorting accented characters differently. In Unicode versions, it can also take into account how characters are composed. For example, an accented e (é) can be encoded as a single character, but it can also be encoded as two separate elements: the e and the accent.
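To see the two encodings of é side by side, a sketch along these lines can be used (program and variable names are made up). CompareStr always treats the two strings as different because their character codes differ. Whether AnsiCompareStr treats them as equal depends on the platform and locale, so the program prints the result rather than assuming one.

```delphi
program ComposedDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils; // System.SysUtils when unit scope names are in use (XE2 and later)

var
  Composed, Decomposed: string;
begin
  Composed := #$00E9;      // é as a single character (U+00E9)
  Decomposed := 'e'#$0301; // e followed by COMBINING ACUTE ACCENT (U+0301)

  // Binary comparison: the strings differ in both length and character codes.
  Writeln('Lengths:        ', Length(Composed), ' vs ', Length(Decomposed));
  Writeln('CompareStr:     ', CompareStr(Composed, Decomposed));

  // Locale-aware comparison: the result is decided by the OS comparison rules.
  Writeln('AnsiCompareStr: ', AnsiCompareStr(Composed, Decomposed));
end.
```

If the application needs the two forms to compare as equal in every case, normalizing both strings to the same form before comparing is the safer route.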

The SysUtils unit includes both the Ansi* and the "normal" comparison functions. In Unicode Delphi they all take strings as their parameters, which really means UnicodeStrings.

If you need to work with AnsiStrings, you need to use the AnsiStrings unit. It has the same set of string comparison functions, but in this unit they all take AnsiStrings as their parameters.

Now, if you do not need compatibility with older versions: use the standard functions from SysUtils. Use the normal ones if a binary comparison is enough. Use the Ansi* ones if you need the locale to be taken into account.
