Memory management / caching for expensive objects in C#

Suppose I have the following class:

    public class MyClass
    {
        public ReadOnlyDictionary<T, V> Dict
        {
            get { return createDictionary(); }
        }
    }

Suppose ReadOnlyDictionary is a read-only wrapper around Dictionary<T, V>.
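(For reference, such a wrapper might look roughly like the sketch below; only a few illustrative members are shown, since .NET 2.0 has no built-in ReadOnlyDictionary type.)

    using System.Collections.Generic;

    // Assumed shape of the read-only wrapper (not a BCL type in .NET 2.0).
    public class ReadOnlyDictionary<T, V>
    {
        private readonly IDictionary<T, V> inner;

        public ReadOnlyDictionary(IDictionary<T, V> inner)
        {
            this.inner = inner;
        }

        public V this[T key]
        {
            get { return this.inner[key]; }
        }

        public int Count
        {
            get { return this.inner.Count; }
        }

        public bool TryGetValue(T key, out V value)
        {
            return this.inner.TryGetValue(key, out value);
        }
    }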

The createDictionary method takes considerable time to complete, and the returned dictionary is relatively large.

Obviously, I want to implement some kind of caching so that I can reuse the result of createDictionary, but I also do not want to abuse the garbage collector or use a lot of memory.

I thought of using WeakReference for the dictionary, but I am not sure whether this is the best approach.

What would you recommend? How do I correctly handle the result of an expensive method that may be called several times?

UPDATE:

I'm interested in advice for a C# 2.0 library (a single, non-visual DLL). The library may be used in both desktop and web applications.

UPDATE 2:

The question also applies to read-only objects; I have changed the property type from Dictionary to ReadOnlyDictionary.

UPDATE 3:

T is a relatively simple type (for example, string). V is a custom class. You can assume that an instance of V is expensive to create. The dictionary can contain anywhere from zero to several thousand elements.

The code is assumed to be accessed either from a single thread, or from multiple threads with an external synchronization mechanism.

I am fine with the dictionary being GC-ed when no one is using it. I am trying to find a balance between time (I want to somehow cache the result of createDictionary) and memory (I do not want the memory to be held longer than necessary).

+6
5 answers

Four main mechanisms are available to you (Lazy<T> only appeared in .NET 4.0, so it is not an option here):

  • lazy initialization
  • virtual proxy
  • ghost
  • value holder

Each has its own advantages.

I suggest a value holder that populates the dictionary the first time its GetValue method is called. You can then use that value for as long as you want; the expensive creation runs only once, and only when it is actually needed.

For more information, see Martin Fowler's page on these lazy-load patterns.
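As a rough illustration (the names below are illustrative, not prescribed; the holder is deliberately not thread safe, which matches the question's external-synchronization assumption), a value holder for the dictionary might look like this:

    // Hypothetical value holder: builds the dictionary on first request, then reuses it.
    public class DictionaryHolder<TKey, TValue>
    {
        public delegate ReadOnlyDictionary<TKey, TValue> CreateDictionary();

        private readonly CreateDictionary factory;
        private ReadOnlyDictionary<TKey, TValue> cachedValue;

        public DictionaryHolder(CreateDictionary factory)
        {
            this.factory = factory;
        }

        public ReadOnlyDictionary<TKey, TValue> GetValue()
        {
            if (this.cachedValue == null)
            {
                // Runs only once, and only when the value is first requested.
                this.cachedValue = this.factory();
            }
            return this.cachedValue;
        }
    }

The owning class would then expose GetValue() (or a property forwarding to it) instead of calling createDictionary() directly.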

+1

WeakReference is not a good caching solution, because your dictionary will not survive the next GC if nothing else holds a reference to it. You can build a simple cache by storing the created value in a member field and reusing it as long as it is not null.

This is not thread safe, and under heavily concurrent access you could end up creating the dictionary several times. You can use the double-checked locking pattern to guard against this with minimal performance cost.
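A minimal sketch of both ideas, reusing the names from the question (MyClass is made generic here so the example compiles, BuildDictionary stands in for the expensive createDictionary, and the wrapper is assumed to take the inner dictionary in its constructor):

    using System.Collections.Generic;

    public class MyClass<T, V>
    {
        private readonly object syncRoot = new object();
        private volatile ReadOnlyDictionary<T, V> cached;

        public ReadOnlyDictionary<T, V> Dict
        {
            get
            {
                // First check without the lock: cheap on the common, already-created path.
                if (this.cached == null)
                {
                    lock (this.syncRoot)
                    {
                        // Second check inside the lock: another thread may have built it meanwhile.
                        if (this.cached == null)
                        {
                            this.cached = this.BuildDictionary();
                        }
                    }
                }
                return this.cached;
            }
        }

        private ReadOnlyDictionary<T, V> BuildDictionary()
        {
            // Placeholder for the expensive createDictionary() work from the question.
            Dictionary<T, V> inner = new Dictionary<T, V>();
            return new ReadOnlyDictionary<T, V>(inner);
        }
    }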

To help you further, you would need to say whether concurrent access is an issue for you, how much memory your dictionary consumes, and how it is created. If, for example, the dictionary is the result of an expensive query, it might help to simply serialize it to disk and reuse it from there until you need to recreate it (that depends on your specific needs).

Caching is just another word for memory leak if you do not have a clear policy for when an object should be removed from the cache. Since you are considering WeakReference, I suppose you do not know exactly when a good time to clear the cache would be.

Another option is to compact the dictionary into a less memory-hungry structure. How many keys does your dictionary have, and what are the values?

+3

Are you sure you need to cache the entire dictionary?

From what you're saying, it might be better to keep a most-recently-used (MRU) list of key-value pairs instead.

If the key is found in the list, just return the value.

If it is not, create just that one value (which is presumably faster than creating all of them, and uses less memory) and store it in the list, evicting the key-value pair that has not been used for the longest time.

Here's a very simple implementation of an MRU list that may serve as inspiration:

    using System.Collections.Generic;
    using System.Linq;

    internal sealed class MostRecentlyUsedList<T> : IEnumerable<T>
    {
        private readonly List<T> items;
        private readonly int maxCount;

        public MostRecentlyUsedList(int maxCount, IEnumerable<T> initialData)
            : this(maxCount)
        {
            this.items.AddRange(initialData.Take(maxCount));
        }

        public MostRecentlyUsedList(int maxCount)
        {
            this.maxCount = maxCount;
            this.items = new List<T>(maxCount);
        }

        /// <summary>
        /// Adds an item to the top of the most recently used list.
        /// </summary>
        /// <param name="item">The item to add.</param>
        /// <returns><c>true</c> if the list was updated, <c>false</c> otherwise.</returns>
        public bool Add(T item)
        {
            int index = this.items.IndexOf(item);

            if (index != 0)
            {
                // item is not already the first in the list
                if (index > 0)
                {
                    // item is in the list, but not in the first position
                    this.items.RemoveAt(index);
                }
                else if (this.items.Count >= this.maxCount)
                {
                    // item is not in the list, and the list is full already
                    this.items.RemoveAt(this.items.Count - 1);
                }

                this.items.Insert(0, item);
                return true;
            }
            else
            {
                return false;
            }
        }

        public IEnumerator<T> GetEnumerator()
        {
            return this.items.GetEnumerator();
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }
    }

In your case, T would be a key-value pair. Keep maxCount small enough to keep lookups fast and to avoid excessive memory usage. Call Add every time you use an item.
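As an illustration (MruCache and CreateValue are hypothetical names, not part of the class above), a small per-key cache built on this list might look roughly like this:

    using System.Collections.Generic;

    // Hypothetical per-key cache: creates values on demand and keeps only the
    // maxCount most recently used key-value pairs, evicting the oldest.
    public class MruCache<TKey, TValue>
    {
        public delegate TValue CreateValue(TKey key);

        private readonly MostRecentlyUsedList<KeyValuePair<TKey, TValue>> mru;
        private readonly CreateValue create;

        public MruCache(int maxCount, CreateValue create)
        {
            this.mru = new MostRecentlyUsedList<KeyValuePair<TKey, TValue>>(maxCount);
            this.create = create;
        }

        public TValue Get(TKey key)
        {
            bool found = false;
            KeyValuePair<TKey, TValue> hit = default(KeyValuePair<TKey, TValue>);

            foreach (KeyValuePair<TKey, TValue> pair in this.mru)
            {
                if (EqualityComparer<TKey>.Default.Equals(pair.Key, key))
                {
                    hit = pair;
                    found = true;
                    break;
                }
            }

            if (found)
            {
                this.mru.Add(hit);   // moves the existing pair back to the front
                return hit.Value;
            }

            TValue value = this.create(key);
            this.mru.Add(new KeyValuePair<TKey, TValue>(key, value)); // evicts the oldest if full
            return value;
        }
    }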

+1

An application should use WeakReference as a caching mechanism when the useful lifetime of an object's presence in the cache is comparable to the lifetime of references to that object. Suppose, for example, that you have a method which creates a ReadOnlyDictionary by deserializing a String. If the typical usage pattern is to read a string, create a dictionary, do something with it, abandon it, and move on to another string, WeakReference is probably not ideal. On the other hand, if your goal is to deserialize many strings (some of which will be equal) into ReadOnlyDictionary instances, a WeakReference cache can be very useful if repeated attempts to deserialize the same string yield the same instance.

Note that the savings come not only from doing the work of creating an instance just once, but also from the facts that (1) there is no need to keep several identical instances in memory, and (2) if two ReadOnlyDictionary variables refer to the same instance, they can be known to be equivalent without examining the instances themselves. By contrast, determining whether two distinct ReadOnlyDictionary instances are equivalent may require examining all the elements in each. Code that has to make many such comparisons can benefit from a WeakReference cache, so that variables holding equivalent instances usually hold the same instance.
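To illustrate that usage, here is a rough sketch of such a cache (the class and delegate names are purely illustrative; a real implementation would also need to prune entries whose targets have been collected):

    using System;
    using System.Collections.Generic;

    // Hypothetical interning-style cache: equal keys return the same instance
    // for as long as that instance is still alive somewhere.
    public class WeakInstanceCache<TKey, TValue> where TValue : class
    {
        public delegate TValue CreateValue(TKey key);

        private readonly Dictionary<TKey, WeakReference> entries =
            new Dictionary<TKey, WeakReference>();

        public TValue GetOrCreate(TKey key, CreateValue create)
        {
            WeakReference weak;
            if (this.entries.TryGetValue(key, out weak))
            {
                TValue alive = weak.Target as TValue;
                if (alive != null)
                {
                    // Same key, same instance: callers can compare by reference.
                    return alive;
                }
            }

            TValue created = create(key);
            this.entries[key] = new WeakReference(created);
            return created;
        }
    }

With string keys and ReadOnlyDictionary values, repeated deserialization of equal strings would then usually hand back the very same dictionary instance.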

+1

I think you have two mechanisms you can rely on for caching instead of building your own. The first, as you suggested yourself, is to use WeakReference and let the garbage collector decide when to free that memory.

The second mechanism is memory paging. If the dictionary is created in one go, it is likely to be stored in a more or less contiguous part of the heap. Just keep the dictionary alive and let Windows page it out to the swap file when you are not using it. Depending on your usage pattern (how random your access to the dictionary is), this may give you better performance than WeakReference.

This second approach is problematic if you are close to the limits of the address space (which in practice happens only in 32-bit processes).

0

Source: https://habr.com/ru/post/917738/

