How to sort a list with diacritics without removing diacritics

How to sort a list containing diacritical letters?

The words used in this example are made up.

Currently I get a list that is displayed like this:

  • Bab
  • Baz
  • Bez

But I want a list that is displayed like this:

  • Baz
  • Bab
  • Bez

In other words, letters with diacritics should be treated as letters in their own right. Is there any way to do this in C#?

1 answer

If you set the culture of the current thread to the language you want to sort by, this should work automatically (provided you do not need a custom sort order). Like this:

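    using System.Collections.Generic;
    using System.Globalization;
    using System.Threading;

    List<string> mylist;
    ....
    // List<string>.Sort() uses the default comparer, which compares strings
    // according to the current thread's culture
    Thread.CurrentThread.CurrentCulture = new CultureInfo("pl-PL");
    mylist.Sort();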

This gives you a list sorted according to the Polish culture settings.
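
If changing the culture of the whole thread is undesirable, a culture-specific comparer can be passed to the sort instead. The following is a minimal sketch, not part of the original answer: `CultureSort` and `SortByCulture` are hypothetical names, while `StringComparer.Create` and `CultureInfo` are the standard .NET APIs. The resulting order depends on the culture data available on the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CultureSort
{
    // Hypothetical helper: returns a new list sorted by the rules of the
    // named culture (e.g. "pl-PL"), leaving the current thread's culture untouched.
    public static List<string> SortByCulture(IEnumerable<string> items, string cultureName)
    {
        // Culture-sensitive, case-sensitive comparer for the requested culture.
        var comparer = StringComparer.Create(new CultureInfo(cultureName), ignoreCase: false);

        var sorted = new List<string>(items);
        sorted.Sort(comparer);
        return sorted;
    }
}
```

Calling `CultureSort.SortByCulture(mylist, "pl-PL")` would then return a sorted copy without affecting any other culture-dependent code running on the same thread.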

Update: If the culture settings do not sort the strings the way you need, another option is to implement your own string comparer.

Update 2: string comparer example:

    using System;
    using System.Collections.Generic;

    public class DiacriticStringComparer : IComparer<string>
    {
        private static readonly HashSet<char> _Specials = new HashSet<char> { 'é', 'ń', 'ó', 'ú' };

        public int Compare(string x, string y)
        {
            // handle special cases first: x == null and/or y == null, x.Equals(y)
            ...

            var lengthToCompare = Math.Min(x.Length, y.Length);
            for (int i = 0; i < lengthToCompare; ++i)
            {
                var cx = x[i];
                var cy = y[i];
                if (cx == cy) continue;

                if (_Specials.Contains(cx) || _Specials.Contains(cy))
                {
                    // handle special diacritics comparison
                    ...
                }
                else
                {
                    // cx must be unequal to cy -> can only be larger or smaller
                    return cx < cy ? -1 : 1;
                }
            }

            // once we are here the strings are equal up to lengthToCompare characters
            // we have already dealt with the strings being equal so now one must be shorter than the other
            return x.Length < y.Length ? -1 : 1;
        }
    }

Disclaimer: I have not tested this, but it should give you the general idea. Note that char.CompareTo() compares by character code value rather than lexicographically, and the < and > operators on char do the same, so the else branch is a plain ordinal comparison. If you need culture-aware ordering there, convert cx and cy to strings and use the default string comparison.
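
One way to fill in the special-case branch of the comparer above is to rank every character by its position in an explicit alphabet. The sketch below is an illustration under stated assumptions, not the answer's own code: `AlphabetComparer` is a hypothetical class, the alphabet string is supplied by the caller, comparison is case-insensitive, and characters missing from the alphabet are placed after the known ones and compared by code value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders strings character by character, using each
// character's position in a caller-supplied alphabet as its sort rank.
public sealed class AlphabetComparer : IComparer<string>
{
    private readonly Dictionary<char, int> _rank = new Dictionary<char, int>();

    public AlphabetComparer(string alphabet)
    {
        // Earlier position in the alphabet means smaller rank.
        for (int i = 0; i < alphabet.Length; i++)
            _rank[alphabet[i]] = i;
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;   // null sorts first
        if (y == null) return 1;

        int n = Math.Min(x.Length, y.Length);
        for (int i = 0; i < n; i++)
        {
            // Assumption: case is ignored, so the alphabet only lists lowercase letters.
            char cx = char.ToLowerInvariant(x[i]);
            char cy = char.ToLowerInvariant(y[i]);
            if (cx == cy) continue;

            bool knownX = _rank.TryGetValue(cx, out int rx);
            bool knownY = _rank.TryGetValue(cy, out int ry);

            if (knownX && knownY) return rx.CompareTo(ry);
            // Assumption: characters outside the alphabet sort after known ones.
            if (knownX != knownY) return knownX ? -1 : 1;
            // Neither character is in the alphabet: fall back to code value order.
            return cx.CompareTo(cy);
        }

        // Equal up to the shorter length: the shorter string sorts first.
        return x.Length.CompareTo(y.Length);
    }
}
```

With an assumed Polish-style alphabet such as "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż", calling `words.Sort(new AlphabetComparer(alphabet))` on the illustrative words "Bez", "Bąb", "Baz" yields Baz, Bąb, Bez, because ą is ranked as its own letter between a and b.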


Source: https://habr.com/ru/post/1345392/

