Using string interning to reduce network client memory usage

I have a network client that processes data from a server.

The data is sent as a series of messages, which are themselves collections of keys/values similar in definition to HTTP headers (except that there is no "message body"). Here is a typical message (lines are separated by the \r\n sequence):

    Response: OK
    Channel: 123
    Status: OK
    Message: Spectrum is green
    Author: Gerry Anderson
    Foo123: Blargh

My protocol client works by reading from a NetworkStream, character by character, using a StreamReader and while( (nc = rdr.Read()) != -1 ). It uses a state-machine parser and a StringBuilder instance to populate Dictionary<String,String> instances. These Dictionary instances are then stored in in-memory structures for further processing; they usually have a useful lifespan of about 10 minutes each.
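For readers who want something concrete, the read loop described above might look roughly like the sketch below. It is not the asker's actual code: the MessageParser class, the ReadMessage method, and the rule that a blank line (or end of stream) terminates a message are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class MessageParser
{
    enum State { Key, Value }

    // Reads one message from the reader; returns null once the stream is exhausted.
    // Assumption: a blank line (or end of stream) ends a message.
    public static Dictionary<string, string> ReadMessage(TextReader rdr)
    {
        var message = new Dictionary<string, string>();
        var sb = new StringBuilder();
        string key = null;
        var state = State.Key;
        int nc;

        while ((nc = rdr.Read()) != -1)
        {
            char c = (char)nc;
            if (c == '\r') continue;                       // the following '\n' ends the line

            if (c == '\n')
            {
                if (state == State.Key && sb.Length == 0)
                {
                    if (message.Count > 0) return message; // blank line: message complete
                    continue;                              // ignore leading blank lines
                }
                if (state == State.Value) message[key] = sb.ToString().Trim();
                sb.Clear();                                // lines without ':' are dropped
                state = State.Key;
            }
            else if (state == State.Key && c == ':')
            {
                key = sb.ToString();                       // one allocation per key
                sb.Clear();
                state = State.Value;
            }
            else
            {
                sb.Append(c);
            }
        }

        // End of stream: flush a value that had no trailing line break.
        if (state == State.Value) message[key] = sb.ToString().Trim();
        return message.Count > 0 ? message : null;
    }
}
```

Every key and value goes through sb.ToString(), which is where the repeated String allocations discussed in this question come from.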

My client receives thousands of these messages per hour and is a long-running process. This is a problem because the process often grows to consume more than 2 GB of memory from these String instances (I used windbg to see where all the memory went), and the code runs on an Azure VM with 3.5 GB of memory. I see no reason why my program should consume more than a few hundred MB of RAM. When I sit and monitor the process's memory consumption over time, it gradually grows to about 2 GB, then suddenly drops to about 100 MB when the GC performs a collection, and then grows again. The time between GC cycles varies, with no predictability.

Since many of these strings are identical (for example the Response and Status keys, as well as well-known values such as OK and Fail), I can use string interning to reduce usage, like so:

    // In the state-machine parser after having read a Key name:
    String key = stringBuilder.ToString();
    key = String.Intern( key );
    // etc... after reading value
    messageDictionary.Add( key, value );

However, I see room for further optimization. First, sb.ToString() still allocates a new string instance that is used only for the intern lookup. Second, interned strings live for the lifetime of the AppDomain, and unfortunately some of the keys will never be reused and will just waste memory, like Foo123 in my example message.
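A small sketch of the behaviour in question, using only String.Intern and Object.ReferenceEquals from the framework. The InternDemo class and its method names are made up for illustration.

```csharp
using System;
using System.Text;

static class InternDemo
{
    // Builds a fresh string at run time, the way the parser does.
    static string Build(string text)
    {
        return new StringBuilder(text).ToString();
    }

    // Two ToString() results with equal content are still separate objects.
    public static bool FreshCopiesShareInstance()
    {
        string a = Build("Status");
        string b = Build("Status");
        return Object.ReferenceEquals(a, b);
    }

    // String.Intern maps equal content to one pooled instance,
    // but each call above still allocated a temporary string first.
    public static bool InternedCopiesShareInstance()
    {
        string a = String.Intern(Build("Status"));
        string b = String.Intern(Build("Status"));
        return Object.ReferenceEquals(a, b);
    }
}
```

The temporaries produced by Build become garbage right away, while the pooled instance stays reachable for as long as the intern pool exists.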

One solution I was thinking about is to not use string interning at all and instead have a class containing static readonly String fields for the known keys. Everything else would be a regular, non-interned string, which will eventually be GC'd and therefore does not risk filling the intern pool with one-off strings. I would then compare the contents of the StringBuilder against these well-known strings and, on a match, use the known instance instead of calling sb.ToString(), thereby skipping another string allocation.
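A minimal sketch of that idea follows. The KnownStrings class, its Resolve method, and the particular set of known strings are hypothetical, picked from the example message and values mentioned in this question.

```csharp
using System;
using System.Text;

static class KnownStrings
{
    public static readonly string Response = "Response";
    public static readonly string Channel  = "Channel";
    public static readonly string Status   = "Status";
    public static readonly string Message  = "Message";
    public static readonly string Author   = "Author";
    public static readonly string OK       = "OK";
    public static readonly string Fail     = "Fail";

    static readonly string[] All = { Response, Channel, Status, Message, Author, OK, Fail };

    // Returns the shared instance when the builder holds a known string;
    // otherwise allocates an ordinary, collectable string.
    public static string Resolve(StringBuilder sb)
    {
        foreach (string known in All)
        {
            if (ContentEquals(sb, known)) return known;
        }
        return sb.ToString();
    }

    // Compares character by character, so no temporary string is created.
    static bool ContentEquals(StringBuilder sb, string s)
    {
        if (sb.Length != s.Length) return false;
        for (int i = 0; i < s.Length; i++)
        {
            if (sb[i] != s[i]) return false;
        }
        return true;
    }
}
```

In the parser, `key = KnownStrings.Resolve(stringBuilder);` would take the place of the ToString() and String.Intern pair. A linear scan is fine for a handful of short strings; a larger set would call for a smarter lookup.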

However, if I choose to intern every string, the intern pool will keep growing, and unfortunately .NET has no .Chlorinate() method for the intern pool. Is there a way to remove single-use strings from the intern pool if I continue with the String.Intern approach, or am I better off using my own static readonly String instances?

1 answer

Interning will not help here, for the reasons you indicated. It will actually make matters worse, since interned strings are not garbage collected. And no, there is no way to remove interned strings from the pool.

What you describe is the GC doing exactly what it is designed to do, so it is not entirely clear to me that you really have a problem. Adopting interning would mean trading garbage collection (which is not a problem) for ever-increasing memory usage (which is a problem).

If you are concerned that the GC does not run often enough to keep memory consumption below a certain threshold, you might consider monitoring your memory usage and calling GC.Collect() when you reach that threshold.
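One way this could look, as a rough sketch: GC.GetTotalMemory and GC.Collect are real framework calls, while the MemoryGuard class, the CollectIfAbove method, and any threshold you pass in are made up for illustration.

```csharp
using System;

static class MemoryGuard
{
    // Forces a collection when the managed heap is at or above thresholdBytes.
    // Returns true if a collection was requested.
    public static bool CollectIfAbove(long thresholdBytes)
    {
        // false = report the current estimate without forcing a collection first
        long managedBytes = GC.GetTotalMemory(false);
        if (managedBytes < thresholdBytes) return false;

        GC.Collect();
        return true;
    }
}
```

This could be called from a timer or after every N messages, for example `MemoryGuard.CollectIfAbove(500L * 1024 * 1024)`. Forced collections have a cost of their own, so the threshold and the call frequency are worth measuring rather than guessing.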

If the GC behavior pattern actually causes a problem (other than looking strange), you can try switching from the default "workstation" GC mode to the "server" GC mode, since they are tuned differently. (But, again, I'm not at all convinced that you really have a problem.)

Some of the differences are described on these two pages:

http://msdn.microsoft.com/en-us/library/ee787088(v=vs.110).aspx#workstation_and_server_garbage_collection

http://blogs.msdn.com/b/dotnet/archive/2012/07/20/the-net-framework-4-5-includes-new-garbage-collector-enhancements-for-client-and-server-apps.aspx

But note that the actual differences change with each release of the framework, because the people responsible for this material are constantly learning and making improvements.

And GC mode is driven by application configuration:

http://msdn.microsoft.com/en-us/library/cc165011(v=office.11).aspx

    <configuration>
      <runtime>
        <gcServer enabled="true"/>
      </runtime>
    </configuration>
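To confirm which mode the process actually ended up in, the runtime exposes it through System.Runtime.GCSettings. The GcInfo helper below is a hypothetical wrapper added for illustration.

```csharp
using System;
using System.Runtime;

static class GcInfo
{
    // Describes the GC mode the current process is running with.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "server" : "workstation";
        return mode + " GC, latency mode " + GCSettings.LatencyMode;
    }
}
```

Logging `GcInfo.Describe()` at startup is a cheap sanity check. Note that server GC is not used on a single-processor machine even when it is requested, so the config setting alone does not guarantee the mode.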

You may also find this troubleshooting guide useful or at least interesting:

http://msdn.microsoft.com/en-us/library/ee851764(v=vs.110).aspx#Issue_TooMuchMemory


Source: https://habr.com/ru/post/1205346/