Extensive use of LOH causes significant performance issues

We have a web service using WebApi 2 and .NET 4.5 on Windows Server 2012. We noticed that from time to time latency jumps by 10-30 ms for no good reason. We were able to track the problematic piece of code down to the LOH and the GC.

There is some text that we convert to its UTF-8 byte representation (actually, the serialization library we use does this). As long as the text is shorter than 85,000 bytes, latency is stable and short: roughly 0.2 ms both on average and at the 99th percentile. As soon as the 85,000-byte boundary is crossed, average latency increases to ~1 ms and the 99th percentile to 16-20 ms. The profiler shows that most of the time is spent in the GC. To be sure: if I put a GC.Collect between iterations, the measured latency goes back to 0.2 ms.

I have two questions:

  • Where does the latency come from? As far as I understand, the LOH is not compacted. The SOH is compacted, but it does not show this delay.
  • Is there a practical way to work around this? Note that I cannot control the size of the data, so I cannot just make it smaller.


    public void PerfTestMeasureGetBytes()
    {
        var text = File.ReadAllText(@"C:\Temp\ContactsModelsInferences.txt");
        var smallText = text.Substring(0, 85000 + 100);
        int count = 1000;
        List<double> latencies = new List<double>(count);

        for (int i = 0; i < count; i++)
        {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            var bytes = Encoding.UTF8.GetBytes(smallText);
            sw.Stop();
            latencies.Add(sw.Elapsed.TotalMilliseconds);
            //GC.Collect(2, GCCollectionMode.Default, true);
        }

        latencies.Sort();
        Console.WriteLine("Average: {0}", latencies.Average());
        Console.WriteLine("99%: {0}", latencies[(int)(latencies.Count * 0.99)]);
    }
+5
2 answers

Performance problems typically arise from two areas: allocation and fragmentation.

Allocation

The runtime guarantees you clean memory, so it spends cycles cleaning it up. When you allocate a large object, that is a lot of memory, and it starts adding milliseconds to a single allocation (when, honestly, a simple allocation in .NET is actually very fast, so we usually never care about this).

Fragmentation occurs when LOH objects are allocated and then reclaimed. Until recently, the GC could not reorganise memory to remove these old-object "gaps", and so it could only fit the next object into a gap if it was the same size or smaller. Recently, the GC gained the ability to compact the LOH, which removes this problem, but costs time during the compaction.
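
As a point of reference only (this is the standard GCSettings API available from .NET 4.5.1 onwards, not code from the question), requesting that on-demand LOH compaction looks roughly like this:

    using System;
    using System.Runtime;

    public static class LohCompaction
    {
        public static void CompactOnce()
        {
            // Ask the next blocking, full GC to also compact the large object heap.
            // The setting automatically resets to Default after that collection runs.
            GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
            GC.Collect();
        }
    }

The compaction itself is the pause you pay for, so this is something to trigger during idle periods rather than per request.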

My guess in your case is that you are suffering from both problems and triggering GC runs, but it depends on how often your code attempts to allocate items on the LOH. If you are doing lots of allocations, try the object-pool route. If you cannot manage a pool effectively (lumpy object lifetimes or disparate usage patterns), try chunking the data you are working with so you avoid the LOH completely.

Options

I came across two approaches to LOH:

  • Avoid it.
  • Use it, but be aware that you are using it, and manage it explicitly.

Avoid it

This involves chunking your large object (usually an array of some sort) into, well, chunks that each fall under the LOH barrier. We do this when serialising large streams of objects. It works well, but an implementation will be specific to your environment, so I hesitate to provide a coded example.
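
Purely to illustrate the chunking idea (the class, method, and chunk size below are hypothetical and not from the answer, which deliberately leaves out code), a sketch could look like this:

    using System;
    using System.IO;
    using System.Text;

    public class ChunkedUtf8Writer
    {
        // 16K chars encode to at most ~48 KB of UTF-8, comfortably below the 85,000-byte LOH threshold.
        private const int ChunkChars = 16 * 1024;
        private readonly byte[] _chunkBuffer = new byte[Encoding.UTF8.GetMaxByteCount(ChunkChars)];

        public void Write(string text, Stream output)
        {
            int offset = 0;
            while (offset < text.Length)
            {
                int charCount = Math.Min(ChunkChars, text.Length - offset);

                // Don't split a surrogate pair across two chunks.
                if (charCount > 1 && char.IsHighSurrogate(text[offset + charCount - 1]))
                    charCount--;

                int byteCount = Encoding.UTF8.GetBytes(text, offset, charCount, _chunkBuffer, 0);
                output.Write(_chunkBuffer, 0, byteCount);
                offset += charCount;
            }
        }
    }

No single allocation here ever reaches the LOH, and the one reusable buffer is allocated just once per writer instance.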

Use it

A simple way to tackle both allocation and fragmentation is long-lived objects. Explicitly allocate a large empty array (or arrays) to accommodate your large object, and do not dispose of it (or them). Leave it around and reuse it like an object pool. You pay for the allocation, but you can do that either on first use or while the application is idle; you pay less for re-allocation (because you are not re-allocating), and you lessen fragmentation problems because you are not constantly asking for allocations and you are not releasing objects (which is what causes the gaps in the first place).
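
As a rough sketch of that long-lived pool idea (type and member names are illustrative, not from the answer):

    using System;
    using System.Collections.Concurrent;

    public class LargeBufferPool
    {
        private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
        private readonly int _bufferSize;

        public LargeBufferPool(int bufferCount, int bufferSize)
        {
            _bufferSize = bufferSize;

            // Pay the LOH allocation cost once, e.g. at application start-up or while idle.
            for (int i = 0; i < bufferCount; i++)
                _buffers.Add(new byte[bufferSize]);
        }

        public byte[] Rent()
        {
            byte[] buffer;
            // If the pool is exhausted, fall back to a fresh (LOH) allocation rather than blocking.
            return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
        }

        public void Return(byte[] buffer)
        {
            _buffers.Add(buffer);
        }
    }

Because the same arrays are handed out again and again, the LOH is not churned on the hot path, which is exactly what causes the gaps described above.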

That said, a halfway house may be the way to go: reserve a section of memory up front for the object pool. Done early, these allocations should be contiguous in memory, so you will not get any gaps, and you leave the tail end of the available memory for uncontrolled items. Beware, though, that this obviously affects the working set of your application - the object pool takes up space whether it is used or not.

Resources

The LOH is covered extensively around the web, but pay attention to the date of the resource. In the latest .NET versions the LOH has received some love and has been improved. That said, if you are on an older version, I think the resources on the net are fairly accurate, as the LOH never really received any serious updates between its inception and .NET 4.5 (ish).

For example, there is this article from 2008 http://msdn.microsoft.com/en-us/magazine/cc534993.aspx

And a summary of the improvements in .NET 4.5: http://blogs.msdn.com/b/dotnet/archive/2011/10/04/large-object-heap-improvements-in-net-4-5.aspx

+5

In addition to the above, make sure you are using the server garbage collector. This does not affect how the LOH is used, but my experience is that it significantly reduces the time spent in GC.
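
Server GC is switched on in app.config or web.config via the gcServer element under runtime (ASP.NET hosted in IIS typically uses server GC already). As a small illustrative check, not part of the original answer, you can confirm at runtime which mode and latency setting the process is actually using:

    using System;
    using System.Runtime;

    public static class GcModeCheck
    {
        public static void Print()
        {
            // True when <gcServer enabled="true"/> (or the host's defaults) put the process in server GC mode.
            Console.WriteLine("Server GC:    {0}", GCSettings.IsServerGC);
            Console.WriteLine("Latency mode: {0}", GCSettings.LatencyMode);
        }
    }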

The best way I have found to avoid large object heap problems is to create a persistent buffer and reuse it. So instead of allocating a new byte array with every call to Encoding.GetBytes, pass a byte array into the method.

In this case, use the GetBytes overload that takes a byte array. Allocate an array that is large enough to hold the bytes for your longest expected string, and keep it around. For example:

    // allocate buffer at class scope
    private byte[] _theBuffer = new byte[1024*1024];

    public void PerfTestMeasureGetBytes()
    {
        // ...
        for (...)
        {
            var sw = Stopwatch.StartNew();
            var numberOfBytes = Encoding.UTF8.GetBytes(smallText, 0, smallText.Length, _theBuffer, 0);
            sw.Stop();
            // ...
        }
    }

The only problem is that you have to make sure your buffer is large enough to hold the largest string. What I have done in the past is to allocate the buffer to the largest size I expect, but then check that it is large enough when I use it. If it is not large enough, reallocate it. How you do that depends on how rigorous you want to be. When working with predominantly Western European text, I would simply double the length of the string. For example:

    string textToConvert = ...
    if (_theBuffer.Length < 2*textToConvert.Length)
    {
        // reallocate the buffer
        _theBuffer = new byte[2*textToConvert.Length];
    }

Another way to do it is to just try GetBytes and reallocate on error, then try again. For example:

    while (!good)
    {
        try
        {
            numberOfBytes = Encoding.UTF8.GetBytes(theString, ....);
            good = true;
        }
        catch (ArgumentException)
        {
            // buffer isn't big enough. Find out how much I really need
            var bytesNeeded = Encoding.UTF8.GetByteCount(theString);
            // and reallocate the buffer
            _theBuffer = new byte[bytesNeeded];
        }
    }

If you make the initial buffer size large enough to accommodate the largest string you expect, you probably will not get this exception very often, which means the number of times you have to reallocate the buffer will be very small. You could, of course, add some padding to bytesNeeded so that you allocate a bit more, in case you have other outliers.
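
For instance, a possible way to add that padding (the 25% figure below is arbitrary, not something from the answer):

    // Grow the shared buffer with some headroom so a string slightly larger than
    // the previous worst case doesn't immediately force yet another reallocation.
    var bytesNeeded = Encoding.UTF8.GetByteCount(textToConvert);
    if (_theBuffer.Length < bytesNeeded)
    {
        _theBuffer = new byte[bytesNeeded + bytesNeeded / 4];
    }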

+3
