Benchmark of different string dictionaries shows that the normal one is faster, and I don't know why

I ran some tests on several string dictionaries because I wanted to know how the different configurations perform, and I was surprised to find that a regular dictionary is the fastest. I'm probably missing a concept or doing something wrong, but these are the results I got:

    Collection: 2147482 items.
    Random Keys: 1000 keys.

    Normal dictionary
        Add 1000 items: 573ms.
        Get 1000 keys: 0ms.
    Normal dictionary with OrdinalIgnoreCase comparer
        Add 1000 items: 642ms.
        Get 1000 keys: 0ms.
    Normal dictionary with InvariantCultureIgnoreCase comparer
        Add 1000 items: 1661ms.
        Get 1000 keys: 0ms.
    Sorted dictionary
        Add 1000 items: 11996ms.
        Get 1000 keys: 5ms.
    Sorted dictionary with OrdinalIgnoreCase comparer
        Add 1000 items: 11097ms.
        Get 1000 keys: 5ms.
    Sorted dictionary with InvariantCultureIgnoreCase comparer
        Add 1000 items: 9814ms.
        Get 1000 keys: 5ms.

This is the code I use to test it:

    static void Main(string[] args)
    {
        String[] col = GenerateCollectionUnsorted(Int32.MaxValue / 1000).ToArray();
        Int32 len = col.Length;
        Console.WriteLine("Collection:\t\t" + len.ToString() + " items.");
        String[] randomKeys = GetRandomKeys(col, 1000);
        Console.WriteLine("Random Keys:\t\t" + randomKeys.Length.ToString() + " keys.");

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine();
        Console.WriteLine("Normal dictionary");
        TestDic(col, randomKeys, new Dictionary<String, String>());

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine();
        Console.WriteLine("Normal dictionary with OrdinalIgnoreCase comparer");
        TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.OrdinalIgnoreCase));

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine();
        Console.WriteLine("Normal dictionary with InvariantCultureIgnoreCase comparer");
        TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine();
        Console.WriteLine("Sorted dictionary");
        TestDic(col, randomKeys, new SortedDictionary<String, String>());

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine();
        Console.WriteLine("Sorted dictionary with OrdinalIgnoreCase comparer");
        TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.OrdinalIgnoreCase));

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine();
        Console.WriteLine("Sorted dictionary with InvariantCultureIgnoreCase comparer");
        TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("\nEnd");
        Console.ReadKey(true);
    }

    static void TestDic(String[] col, String[] randomKeys, IDictionary<String, String> dic)
    {
        Stopwatch crono = new Stopwatch();
        crono.Start();
        foreach (String s in col)
            dic.Add(s, s);
        crono.Stop();
        Console.WriteLine("\tAdd " + randomKeys.Length.ToString() + " items:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");

        crono.Reset();
        crono.Start();
        String sv;
        foreach (var rk in randomKeys)
        {
            sv = dic[rk];
            sv.ToString();
        }
        crono.Stop();
        Console.WriteLine("\tGet " + randomKeys.Length.ToString() + " keys:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");
    }

    static String[] GetRandomKeys(String[] col, Int32 count)
    {
        Random ran = new Random(DateTime.Now.Millisecond);
        List<Int32> indexSelection = new List<int>();
        List<String> selection = new List<string>();
        Int32 len = col.Length;
        while (indexSelection.Count < count)
        {
            Int32 value = ran.Next(0, len - 1);
            if (!indexSelection.Contains(value))
            {
                indexSelection.Add(value);
                selection.Add(col[value]);
            }
        }
        return selection.ToArray();
    }

    static IEnumerable<String> GenerateCollection(Int32 count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i.ToString("X").PadLeft(5, '0');
        }
    }

    static IEnumerable<String> GenerateCollectionUnsorted(Int32 amount)
    {
        for (int i = 0; i < amount / 2; i++)
        {
            yield return i.ToString("X").PadLeft(5, '0');
            yield return (amount - i).ToString("X").PadLeft(5, '0');
        }
    }

Any idea why this is happening?

EDIT 1: As I understand it, a sorted dictionary adds elements more slowly and retrieves them faster because the collection is kept sorted. I also thought that comparing strings with OrdinalIgnoreCase or InvariantCultureIgnoreCase should be faster than the default comparison, so lookups should be faster. But maybe my understanding is completely wrong :D

Thanks in advance.

EDIT 2: As a curiosity, I did some tests to compare strings:

    Collection: 2147482 items.
    Random Keys: 1000 keys.
    CurrentCulture LookUp: 158209ms.
    CurrentCultureIgnoreCase LookUp: 160710ms.
    InvariantCulture LookUp: 132765ms.
    InvariantCultureIgnoreCase LookUp: 133409ms.
    Ordinal LookUp: 36115ms.
    OrdinalIgnoreCase LookUp: 36329ms.
3 answers

Only the regular dictionary with the default comparer is actually a normal dictionary; all the rest are sorted dictionaries.

So the result is quite consistent.

Edit:

As that turned out to be wrong, the results differ. :)

A sorted dictionary is slower when adding items, since they have to be sorted, but it is not faster when getting items. Looking up in a binary tree is fast, but the hash lookup that a regular dictionary uses is even faster. As a binary tree grows, each lookup takes more steps, while a dictionary mostly grows by adding more buckets, so the number of comparisons is hardly affected.
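As a rough illustration of that growth argument (a minimal sketch of my own, not the question's benchmark; the sizes and key format are arbitrary), you can time repeated lookups of one key in both structures:

```csharp
// Sketch: repeated lookups of a single key in a hash-based Dictionary
// vs a tree-based SortedDictionary built from the same data.
using System;
using System.Collections.Generic;
using System.Diagnostics;

var hash = new Dictionary<string, string>();
var tree = new SortedDictionary<string, string>();
for (int i = 0; i < 200_000; i++)
{
    string key = i.ToString("X").PadLeft(5, '0');
    hash.Add(key, key);   // amortized O(1) per insert
    tree.Add(key, key);   // O(log n) per insert, keeps keys ordered
}

var sw = Stopwatch.StartNew();
for (int i = 0; i < 1_000_000; i++) _ = hash["0000A"];
long hashMs = sw.ElapsedMilliseconds;

sw.Restart();
for (int i = 0; i < 1_000_000; i++) _ = tree["0000A"];
long treeMs = sw.ElapsedMilliseconds;

Console.WriteLine($"hash lookups: {hashMs}ms, tree lookups: {treeMs}ms");
```

On typical hardware the hash lookups finish noticeably sooner, and the gap widens as the collections grow.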

String comparisons with Ordinal are faster than the default (CurrentCulture), and OrdinalIgnoreCase is faster than CurrentCultureIgnoreCase, but I am not sure that OrdinalIgnoreCase is faster than CurrentCulture. An InvariantCulture comparison is not faster than the default comparison; it just uses a different culture. The reason ordinal comparison is so much faster is that it doesn't have to deal with culture settings at all.
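Note that the comparer also changes which keys count as equal, not just how fast comparisons run. A small sketch (the example keys and values are mine):

```csharp
// Sketch: with an OrdinalIgnoreCase comparer, keys that differ only in
// case are the *same* key, so the second assignment overwrites the first.
using System;
using System.Collections.Generic;

var ignoreCase = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
ignoreCase["Key"] = 1;
ignoreCase["KEY"] = 2;   // same key under OrdinalIgnoreCase: overwrites

var ordinal = new Dictionary<string, int>(StringComparer.Ordinal);
ordinal["Key"] = 1;
ordinal["KEY"] = 2;      // distinct keys under Ordinal

Console.WriteLine($"ignore-case count: {ignoreCase.Count}, ordinal count: {ordinal.Count}");
// ignore-case count: 1, ordinal count: 2
```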

By the way, I noticed a bug in the GetRandomKeys method. It will never select the last element, since ran.Next(0, len - 1) returns a random number between 0 and len - 2 (the upper bound is exclusive).
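One possible fix (a hypothetical rewrite of the question's helper, with a HashSet swapped in for the O(n) List.Contains check): since Random.Next's upper bound is exclusive, passing the full length covers every index.

```csharp
using System;
using System.Collections.Generic;

static string[] GetRandomKeys(string[] col, int count)
{
    var ran = new Random();
    var chosen = new HashSet<int>();          // O(1) duplicate check
    var selection = new List<string>(count);
    while (selection.Count < count)
    {
        int index = ran.Next(0, col.Length);  // 0 .. Length - 1 inclusive
        if (chosen.Add(index))                // Add returns false for duplicates
            selection.Add(col[index]);
    }
    return selection.ToArray();
}

string[] sample = GetRandomKeys(new[] { "a", "b", "c", "d" }, 2);
Console.WriteLine(string.Join(", ", sample));
```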


From the MSDN documentation on SortedDictionary:

The SortedDictionary<TKey, TValue> generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary.

From the docs on Dictionary:

Retrieving a value by using its key is very fast, close to O(1), because the Dictionary<TKey, TValue> class is implemented as a hash table.

So it is not surprising that Dictionary outperforms SortedDictionary in many cases.
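You can observe the O(log n) side of this directly by counting comparisons with an instrumented comparer (a sketch of my own; Comparer&lt;string&gt;.Create is the standard BCL factory method):

```csharp
// Sketch: count how many key comparisons one SortedDictionary lookup
// performs. For 1024 items a balanced tree needs roughly log2(1024) = 10,
// bounded by about 2*log2(n) for a red-black tree.
using System;
using System.Collections.Generic;

int comparisons = 0;
var counting = Comparer<string>.Create((x, y) =>
{
    comparisons++;                       // count every key comparison
    return string.CompareOrdinal(x, y);
});

var tree = new SortedDictionary<string, string>(counting);
for (int i = 0; i < 1024; i++)
{
    string key = i.ToString("D5");
    tree.Add(key, key);
}

comparisons = 0;
_ = tree["00500"];
Console.WriteLine($"One lookup among 1024 keys: {comparisons} comparisons");
```

A hash-based Dictionary, by contrast, hashes the key once and usually compares against only the handful of keys in one bucket, regardless of how large the dictionary grows.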


I don't understand why you're surprised by the results.

The "normal"/unsorted dictionary will obviously be faster than a sorted one, since keeping the collection sorted requires extra operations.

The two "normal" dictionaries with non-default comparers require extra processing for each comparison.


Source: https://habr.com/ru/post/1332143/

