Should I use the C # dictionary if I need only a quick key search and the values ​​are irrelevant?

I need a data type that can insert records, and then be able to quickly determine if a record has already been inserted. A Dictionary seems to fit this need (see Example). However, I do not use the values dictionary. Should I use a dictionary or is there another suitable data type?

 public class Foo { private Dictionary<string, bool> Entities; ... public void AddEntity(string bar) { if (!Entities.ContainsKey(bar)) { // bool value true here has no use and is just a placeholder Entities.Add(bar, true); } } public string[] GetEntities() { return Entities.Keys.ToArray(); } } 
+42
dictionary c # lookup
03 Mar. '17 at 17:14
source share
2 answers

You can use HashSet<T> .

The HashSet<T> class provides high-performance dialing operations. A set is a set containing non-duplicate elements and elements do not have a special order .

+83
Mar 03 '17 at 17:15
source share

Habib's answer is excellent, but for multi-threaded environments, if you use HashSet<T> , then with the last word you need to use lock to protect access to it, I find myself more prone to creating deadlocks with lock statements. In addition, lock gives the worst acceleration to Amdahl's law , because adding a lock statement reduces the percentage of your code that is actually parallel.

For these reasons, ConcurrentDictionary<T,object> matches the count in multi-threaded environments. If you finish using it, wrap it as in your question. Just a new up object to throw as values ​​as needed, as the values ​​will not be important. You can make sure that there are no lock statements in the source code .

If you do not need collection variability, then this will be a moot point. But your question implies that you need it, since you have an AddEntity method.

Additional information 2017-05-19 . In fact, ConcurrentDictionary uses internal locks, although not lock for statements - it uses Monitor.Enter (check out TryAddInternal ). However, it seems to block individual buckets in the dictionary, which means there will be less disputes than putting everything in a lock statement.

In general, ConcurrentDictionary often better suited for multi-threaded environments.

It’s actually quite difficult (impossible?) To make a parallel hash set using only Interlocked methods. I tried it on my own and constantly faced with the problem of the need to simultaneously change two things - something that only the lock as a whole can do. One workaround I found was to use singly linked lists for hash buckets and to deliberately create loops in the list when one thread was supposed to run on a node without interference from other threads; this will cause other threads to be caught rotating in the same place until this thread is executed using node and reduces the loop. Of course, he technically did not use locks, but he did not scale well.

+2
Mar 20 '17 at 20:28
source share



All Articles