LINQ in-memory performance

I have a list, users, that holds about 100K+ user records (all User objects are fully loaded from the database, with fields such as Bio, first name, last name, etc.). The collection is retrieved from the database at application startup and kept in memory.

Then I have code like:

User cachedUser = users.FirstOrDefault(x => string.Equals(x.UserName, username, StringComparison.CurrentCultureIgnoreCase)); 

I use this to retrieve users from the collection. For some reason, though, I noticed that this operation is incredibly slow. Is there a performance problem with using LINQ to query an in-memory collection of large objects? Should I query the database instead every time I want to get a user?

+6
4 answers

If you want to optimize response time, you can create a Dictionary<TKey, TValue> and look the user up in it:

  Dictionary<string, User> usersDictionary =
      new Dictionary<string, User>(StringComparer.CurrentCultureIgnoreCase);

  // After querying the users from the DB, add each one to the dictionary
  usersDictionary.Add(user.UserName, user);

  // Then, when you need to retrieve a user
  User retrieveUser = null;
  usersDictionary.TryGetValue(username, out retrieveUser);

Hope this helps!

+3

I think you may need to rethink your architecture, based on the information you gave us. Use the database and let it do the searching for you. Observe, measure, and make appropriate changes afterwards. You may find that you optimized all of this prematurely.

+8

Your LINQ query, like any other iteration method (loop, search in an array), will access each individual record until the requested record is found. In the worst case, this means 100k comparisons. To make it faster, you have the following options:

  • Use a sorted list or a dictionary. Binary search on a sorted list needs O(log n) comparisons instead of O(n), and a hash-based dictionary lookup is O(1) on average. For the sorted list, sort the data when retrieving it from the database using ORDER BY.
  • Use a DataSet. It is similar to an in-memory database and can provide quick lookups.
  • Leave the data in the database and set the appropriate indexes for quick access.
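To illustrate the sorted-list option, here is a minimal sketch using List<T>.Sort and List<T>.BinarySearch. The User class, UserNameComparer, and SortedUserLookup are hypothetical names made up for this example; the asker's real User type has more fields. It sorts in memory rather than relying on ORDER BY, because the database collation may not match the .NET comparer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal User type; the real one has more fields (Bio, etc.).
public class User
{
    public string UserName { get; set; }
}

// Orders users by UserName, ignoring case, so sorting and searching agree.
public class UserNameComparer : IComparer<User>
{
    public int Compare(User a, User b)
    {
        return StringComparer.CurrentCultureIgnoreCase.Compare(a.UserName, b.UserName);
    }
}

public static class SortedUserLookup
{
    private static readonly UserNameComparer Comparer = new UserNameComparer();

    // Sort once after loading. BinarySearch only works on a list that is
    // sorted with the same comparer that is used for the search.
    public static void Prepare(List<User> users)
    {
        users.Sort(Comparer);
    }

    // O(log n) lookup; returns null when no user matches.
    public static User Find(List<User> sortedUsers, string username)
    {
        var probe = new User { UserName = username };
        int index = sortedUsers.BinarySearch(probe, Comparer);
        return index >= 0 ? sortedUsers[index] : null;
    }
}
```

Call SortedUserLookup.Prepare(users) once at startup, then SortedUserLookup.Find(users, username) for each lookup. For 100k records that is roughly 17 comparisons in the worst case instead of 100k.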

I suggest using the database for the following reasons:

  • Storing 100 thousand records, most of which you will probably never use, is a waste of memory.
  • Once you change your data, you will need to update your cache, which can be quite complicated.
  • Web applications are multithreaded (each request is executed in its own thread). If you change your data, you will have to synchronize with locks.
  • A database can cache frequently requested data.
  • You need to write less code.
  • You get a stateless web application, which scales better (web farms).
  • Your application probably has other data as well, and you cannot keep everything in memory.
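To give an idea of the synchronization that an in-memory cache would need, here is a rough sketch of a lock-protected cache. UserCache and its members are hypothetical names for this example, and the User class is a stand-in for the real type. It only shows the locking pattern; it does not cover keeping the cache consistent with the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal User type; the real one has more fields.
public class User
{
    public string UserName { get; set; }
}

// Hypothetical cache wrapper: every read and write takes the same lock,
// because Dictionary<TKey, TValue> is not safe for concurrent writes.
public class UserCache
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, User> _byName =
        new Dictionary<string, User>(StringComparer.CurrentCultureIgnoreCase);

    // Returns the cached user, or null when the name is unknown.
    public User Find(string username)
    {
        lock (_gate)
        {
            User user;
            return _byName.TryGetValue(username, out user) ? user : null;
        }
    }

    // Inserts the user, or replaces an existing entry with the same name.
    public void AddOrUpdate(User user)
    {
        lock (_gate)
        {
            _byName[user.UserName] = user;
        }
    }

    // Returns true when an entry was actually removed.
    public bool Remove(string username)
    {
        lock (_gate)
        {
            return _byName.Remove(username);
        }
    }
}
```

Every code path that changes a user then has to go through AddOrUpdate or Remove, in addition to updating the database, which is the extra code and complexity the points above refer to.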
+3

The difference in search performance that you notice arises because the database uses an index to find strings, whereas in memory you simply walk through all the records until you find a match. In addition, the database can store a hash of the string and look it up by that hash, which is much faster because it avoids comparing the strings themselves.

Dictionary<> also indexes its entries, but populating it takes extra time up front, because each time an item is added it has to work out where in the index the item belongs.
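As a sketch of how that up-front cost can be paid once at startup, the example below builds the dictionary in a single pass and passes the expected size to the constructor so the internal table does not have to grow repeatedly. UserIndex and the minimal User class are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal User type; the real one has more fields.
public class User
{
    public string UserName { get; set; }
}

public static class UserIndex
{
    // Builds a case-insensitive index over the users in one pass.
    public static Dictionary<string, User> Build(IReadOnlyCollection<User> users)
    {
        // Pre-sizing with users.Count avoids repeated resizing while filling.
        var index = new Dictionary<string, User>(
            users.Count, StringComparer.CurrentCultureIgnoreCase);

        foreach (var user in users)
        {
            // Indexer assignment keeps the last user on duplicate names;
            // Add would throw an ArgumentException instead.
            index[user.UserName] = user;
        }

        return index;
    }
}
```

After the index is built, each lookup is a single TryGetValue call, so the build cost is spread over all later searches.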

In addition, the database caches results. Many databases also cache indexes and keep additional statistics that help them quickly find what you are looking for.

It is better to let the database do the searching, unless you can do something faster for special cases of your own.

0

Source: https://habr.com/ru/post/918583/

