C#: Why is a dictionary so much faster than a list?

I am testing the speed of getting data from a Dictionary vs. a List.
I used this code for testing:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

internal class Program
{
    private static void Main(string[] args)
    {
        var stopwatch = new Stopwatch();
        List<Grade> grades = Grade.GetData().ToList();
        List<Student> students = Student.GetStudents().ToList();

        // Time the lookups done with LINQ over the list.
        stopwatch.Start();
        foreach (Student student in students)
        {
            student.Grade = grades.Single(x => x.StudentId == student.Id).Value;
        }
        stopwatch.Stop();
        Console.WriteLine("Using list {0}", stopwatch.Elapsed);
        stopwatch.Reset();

        students = Student.GetStudents().ToList();

        // Time the lookups done via a dictionary keyed by StudentId
        // (building the dictionary is included in the timed section).
        stopwatch.Start();
        Dictionary<Guid, string> dic = Grade.GetData().ToDictionary(x => x.StudentId, x => x.Value);
        foreach (Student student in students)
        {
            student.Grade = dic[student.Id];
        }
        stopwatch.Stop();
        Console.WriteLine("Using dictionary {0}", stopwatch.Elapsed);
        Console.ReadKey();
    }
}

public class GuidHelper
{
    public static List<Guid> ListOfIds = new List<Guid>();

    static GuidHelper()
    {
        for (int i = 0; i < 10000; i++)
        {
            ListOfIds.Add(Guid.NewGuid());
        }
    }
}

public class Grade
{
    public Guid StudentId { get; set; }
    public string Value { get; set; }

    public static IEnumerable<Grade> GetData()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Grade { StudentId = GuidHelper.ListOfIds[i], Value = "Value " + i };
        }
    }
}

public class Student
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Grade { get; set; }

    public static IEnumerable<Student> GetStudents()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Student { Id = GuidHelper.ListOfIds[i], Name = "Name " + i };
        }
    }
}

In memory there is a list of students and a list of grades, where each grade holds the StudentId it belongs to.
First I looked up each student's grade with LINQ over the list, which took about 7 seconds on my machine. Then I converted the list of grades to a dictionary first and looked the grades up by key, which took less than a second.

+45
performance c#
Jun 07 '13 at 6:33
7 answers

When you do this:

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

As written, it has to walk the entire List until it finds the entry with the matching StudentId (does entry 0 match the lambda? No... does entry 1 match the lambda? No... and so on). This is O(n). Since you do it once for each student, the whole loop is O(n^2).
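Roughly speaking, this is what that line has to do under the hood (a sketch for illustration, not the actual LINQ source; it reuses the grades and student variables from the question's code):

    // Linear search: every element may need to be examined. Single() even keeps
    // scanning after a match, because it must verify the match is unique.
    Grade found = null;
    foreach (Grade x in grades)                       // O(n) work per student
    {
        if (x.StudentId == student.Id)
        {
            if (found != null)
                throw new InvalidOperationException("Sequence contains more than one matching element");
            found = x;
        }
    }
    if (found == null)
        throw new InvalidOperationException("Sequence contains no matching element");
    student.Grade = found.Value;                      // repeated for all n students => O(n^2)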

However, when you do this:

student.Grade = dic[student.Id];

When you look up a specific element by its key, the dictionary can go straight to where it is stored; this is O(1). Doing that once for every student is O(n) overall. (If you want to know how this works: the dictionary performs a mathematical operation on the key, a hash, that turns it into a position inside the dictionary's storage, the same position it computed when the entry was inserted.)
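Very loosely, the lookup works along these lines (a conceptual sketch only, not the real Dictionary<TKey,TValue> internals; the bucketCount value here is hypothetical):

    int bucketCount = 16;                                 // hypothetical bucket count
    int hash = student.Id.GetHashCode() & 0x7FFFFFFF;     // non-negative hash of the key
    int bucket = hash % bucketCount;                      // which bucket to look in
    // Only the few entries in that one bucket are examined, so the cost does not
    // grow with the total number of entries: O(1) on average.
    Console.WriteLine("Student {0} maps to bucket {1}", student.Id, bucket);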

So the dictionary is faster because you used a better algorithm.

+97
Jun 07 '13 at 6:38

When you use the dictionary, you retrieve your information by key, which lets it find the value efficiently. When you use the List with the Single LINQ expression, it has no option but to scan the whole list for the required element, because it is just a list.

+11
Jun 07 '13 at 6:37

The reason is that a dictionary is a lookup, while a list is an iteration.

The dictionary uses a hash lookup, while your list requires walking through it, from the beginning to the matching element, every single time.

To put it another way: the list will be faster than the dictionary on the first element, because there is nothing to look through. It's the first item, boom, it's done. But the second time the list has to look at the first element, then the second element. The third time through it has to look at the first element, then the second element, then the third element, and so on.

So each successive lookup takes a little longer, and the larger the list, the longer it takes. The dictionary, by contrast, has a more or less fixed lookup time (it also increases as the dictionary grows, but much more slowly, so in comparison it is almost constant).
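A rough way to see the difference in growth (a purely illustrative sketch, not part of the original post; it reuses the students, grades and dic collections from the question and simply counts how much work each approach does):

    long listComparisons = 0;
    foreach (Student student in students)
    {
        // First() stops at the match, so on average about half the list is examined
        // per student; for n = 10,000 that is on the order of 50 million comparisons.
        student.Grade = grades.First(x => { listComparisons++; return x.StudentId == student.Id; }).Value;
    }

    long dictionaryLookups = 0;
    foreach (Student student in students)
    {
        dictionaryLookups++;            // one hashed lookup per student, ~10,000 in total
        student.Grade = dic[student.Id];
    }

    Console.WriteLine("List: {0} elements examined, dictionary: {1} lookups",
        listComparisons, dictionaryLookups);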

+9
Jun 07 '13 at 6:43

The dictionary uses hashing to look up the data. Each element in the dictionary is stored in a bucket together with the other elements that share the same hash. That is much faster.

Try sorting your list and searching it with a binary search; it will be a little faster, though still not as fast as the dictionary.
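For example (a sketch under the assumption you keep the grades and students collections from the question; this is not code from the original answer), the list could be sorted once by StudentId and then searched with List<T>.BinarySearch:

    // Sort once (O(n log n)), then O(log n) per lookup: better than a linear scan,
    // but still slower than the dictionary's ~O(1) per lookup.
    List<Grade> sorted = grades.OrderBy(g => g.StudentId).ToList();
    IComparer<Grade> byId = Comparer<Grade>.Create((a, b) => a.StudentId.CompareTo(b.StudentId));

    foreach (Student student in students)
    {
        int index = sorted.BinarySearch(new Grade { StudentId = student.Id }, byId);
        if (index >= 0)
        {
            student.Grade = sorted[index].Value;
        }
    }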

+8
Jun 07 '13 at 6:38

The dictionary uses a hash table, which is an excellent data structure since it maps an input to its corresponding output almost instantly. It has O(1) complexity, as already noted, which means a more or less direct lookup.

The disadvantages are that, for good performance, you need a lot of space up front (depending on the implementation, separate chaining versus linear/quadratic probing, you may need at least as many slots as the items you plan to store, probably double in the latter case), and you need a good hashing algorithm that maps your input ( "John Smith" ) well onto a position in the underlying array ( hash_array[34521] ).

Listing the entries in sorted order is also a problem. If I may quote Wikipedia:

Listing all n entries in some specific order generally requires a separate sorting step, whose cost is proportional to log(n) per entry.
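To illustrate the quoted point with the dic dictionary from the question (a sketch, not part of the original answer): a Dictionary has no inherent ordering, so a sorted listing needs an explicit sort, or a different structure altogether.

    // Enumerating a Dictionary in key order requires sorting the entries first.
    foreach (KeyValuePair<Guid, string> pair in dic.OrderBy(p => p.Key))
    {
        Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
    // If sorted-by-key traversal is needed often, SortedDictionary<Guid, string>
    // (a balanced tree, O(log n) per operation) can be a better fit than Dictionary.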

Read about linear probing and separate chaining for some of the gorier details :)

+5
Jun 07 '13 at 12:13

The dictionary is based on a hash table, which is a rather efficient algorithm for looking things up. With a list you have to go element by element in order to find something.

It's all about data organization ...

+3
Jun 07 '13 at 6:38

When it comes to retrieving data, a keyed collection is always faster than a keyless one. This is because with a keyless collection you have to enumerate its elements in order to find what you are looking for, whereas with a keyed collection you can simply access the element directly using the key.

Here are some good articles comparing a list with a dictionary.

Here. And this one.

+2
Jun 07 '13 at 7:00
