Sometimes a little piece of seemingly innocuous code can cause a significant amount of trouble:

[code language="csharp"]
public byte[] Serialize(object o)
{
    using (var stream = new MemoryStream())
    {
        MySerializer.Serialize(stream, o);
        return stream.ToArray();
    }
}
[/code]

Doesn’t look like much, but I’m sure we have all written something like this and paid it no mind. In fact, most of the time, code like this isn’t a problem.

This little method becomes a problem when that object o parameter occupies 85,000 bytes or more of contiguous memory. At that point, the object is considered a "large object." That designation is significant because the .NET runtime allocates objects of that size on the Large Object Heap (LOH) rather than on the gen 0 (small object) heap.
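You can observe this behavior directly (a minimal sketch; the generation check is an observation of current runtime behavior, not a contractual API guarantee): objects allocated on the LOH report as generation 2 immediately after allocation.

[code language="csharp"]
// LOH objects report as generation 2 right away; small objects start in gen 0.
var small = new byte[80 * 1024];   // 81,920 bytes -- under the 85,000-byte threshold
var large = new byte[85 * 1024];   // 87,040 bytes -- over the threshold

Console.WriteLine(GC.GetGeneration(small)); // 0 -- small object heap
Console.WriteLine(GC.GetGeneration(large)); // 2 -- Large Object Heap
[/code]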

A situation called LOH fragmentation arises when an application frequently creates and releases many large objects. A classic example is a file upload service that receives and processes large volumes of data in each request. The resulting fragmentation can cause an OutOfMemoryException to be thrown even though the machine still has plenty of free memory and the process address space is far from exhausted.

So, getting back to our little innocuous piece of code: we already know that if o is a large object, it will consume space in the LOH.

However… There’s More

We are actually allocating more than one large object. This subtlety is nefarious and can catch you completely unaware until your process crashes while using not that much memory.

[code language="csharp"]
var stream = new MemoryStream();
[/code]

The memory stream will allocate an internal buffer to hold the serialized bytes. Depending upon the serialization technique and whether compression is being used, this buffer will typically be a large object as well. Worse, as the stream grows, MemoryStream allocates a new, larger buffer and copies the old contents over, so a single serialization can abandon several dead large buffers along the way. That’s potentially several more entries in the LOH!
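One mitigation, sketched below under the assumption that the caller can estimate the serialized size up front (the estimatedSize parameter is hypothetical), is to pre-size the stream so it allocates its large buffer once instead of repeatedly growing and copying:

[code language="csharp"]
// Pre-sizing the MemoryStream avoids the grow-and-copy cycle that
// litters the LOH with abandoned intermediate buffers.
public byte[] Serialize(object o, int estimatedSize)
{
    using (var stream = new MemoryStream(estimatedSize))
    {
        MySerializer.Serialize(stream, o);
        return stream.ToArray();
    }
}
[/code]

Even a rough overestimate helps: one oversized buffer on the LOH beats a chain of progressively larger dead ones.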

[code language="csharp"]
MySerializer.Serialize(stream, o);
[/code]

You may have a custom-written serializer or one written by a third party. What is the serializer doing internally? It may be allocating additional buffers. This may be a source of still more LOH entries.

[code language="csharp"]
return stream.ToArray();
[/code]

Finally, we’re done with the stream and we return the array of serialized bytes. However, the ToArray() method makes a copy of the bytes in the MemoryStream’s internal buffer, creating, you guessed it, another large object and another entry in the LOH.
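If the caller can work with the stream’s internal buffer directly, that final copy can be avoided entirely. A sketch (the Send consumer is hypothetical; note that GetBuffer() requires a stream created with the parameterless constructor or with publiclyVisible set to true):

[code language="csharp"]
using (var stream = new MemoryStream())
{
    MySerializer.Serialize(stream, o);

    // GetBuffer() returns the internal buffer without copying. It is usually
    // longer than the actual data, so the logical length must travel with it.
    byte[] buffer = stream.GetBuffer();
    int length = (int)stream.Length;

    Send(buffer, 0, length);  // hypothetical consumer of (buffer, offset, count)
}
[/code]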

That innocent 4-line method can potentially create three (or more) large objects beyond the one passed in. If you’re calling it frequently, you’re bound to fragment the LOH sooner or later. Even if you’re not, over the long term you may see memory climb and never be relinquished, which can cause your web application to recycle or crash.

There’s also something else super surprising about LOH allocations: they are only collected as part of gen 2 collections! That means large objects are effectively long-lived and put even more surprise memory pressure on your process. Gen 2 collections are typically minutes apart, whereas gen 0 collections are milliseconds apart. Minutes compared to milliseconds is eons in computer time.

The Large Object Heap’s Dirty Little Secret

The .NET garbage collector is fantastically good at optimizing the small object heaps, reclaiming memory and compacting them when necessary. For the LOH, however, the GC skips compaction entirely, because moving and reorganizing large blocks of memory can consume enormous amounts of CPU; this was a good design decision by Microsoft. Instead, Microsoft has always stressed that developers should be careful with large objects: use them sparingly, or employ pooling to reduce the number of allocations necessary.

That’s all great advice, but when you have massive applications doing heavy processing over many days, no matter how careful you are, you can still run into LOH fragmentation, as I did recently. After 10 days of number crunching on a massive analysis application, we hit the dreaded OutOfMemoryException. All the evidence in the investigation pointed to a fragmented LOH.

The great news is that in version 4.5.1 of the .NET Framework, Microsoft introduced a little-advertised feature that allows compaction of the LOH on request. It was added exactly for situations like ours: a long-running process doing massive amounts of data manipulation; in our case, manipulation of large arrays of integers. To compact the LOH and free unused memory in .NET 4.5.1, all you need to execute is the following:

[code language="csharp"]
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
[/code]

Technically, the explicit GC.Collect() is only needed if you want the compaction to happen immediately. Setting LargeObjectHeapCompactionMode to CompactOnce causes the LOH to be compacted on the next blocking gen 2 garbage collection, after which the mode automatically resets to the default behavior of not compacting the LOH.

Remember, this has a non-zero impact on performance; so use it very sparingly.

Is There A Right Way?

Well, quite frankly, Microsoft’s original guidance is the better way. Reduce the number of large objects you’re creating by breaking them into smaller components, or manage your buffers by pooling them for reuse.

In the case of the method at the beginning of this article, breaking the object into smaller components may not be feasible, but adding a buffer pool would definitely help. Pooling allows you to allocate some buffers in advance, use them, and return them to the pool when finished.
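As a rough illustration (a minimal sketch, not production code; the BufferPool, Rent, and Return names are hypothetical), a pool can hand out pre-allocated large buffers and take them back when callers finish, so the same LOH allocations are reused indefinitely:

[code language="csharp"]
using System.Collections.Concurrent;

// A minimal thread-safe pool of fixed-size byte buffers. The buffers land
// on the LOH once, up front, and are then reused instead of being
// allocated and abandoned on every request.
public sealed class BufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public BufferPool(int bufferSize, int initialCount)
    {
        _bufferSize = bufferSize;
        for (int i = 0; i < initialCount; i++)
            _buffers.Add(new byte[bufferSize]);
    }

    public byte[] Rent()
    {
        byte[] buffer;
        // Fall back to a fresh allocation if the pool is empty.
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        // Only accept buffers of the expected size back into the pool.
        if (buffer != null && buffer.Length == _bufferSize)
            _buffers.Add(buffer);
    }
}
[/code]

On newer frameworks, System.Buffers.ArrayPool&lt;byte&gt;.Shared provides a battle-tested implementation of this same idea.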

There may be cases where re-architecting your objects or adding pooling simply isn’t feasible for any number of reasons. In those situations, you’ll need to manually compact the LOH. However, always look to reduce use of large objects or pool buffers first.

In Conclusion

Be mindful of your objects. Keep them small enough to be handled by the small object heaps. If you do need large objects, reuse them as much as possible. And remember: all it takes is a few lines of seemingly harmless code to devastate your large object heap and consume big chunks of RAM.

One way to learn more about the garbage collector is to take Jeffrey Richter’s Mastering the .Net Framework class offered here at Wintellect.

To learn how to analyze the heap in more detail, look for John Robbins’s Mastering .NET Debugging to dive into WinDbg and SOS.

And for further reading, here’s a bunch of links (in no particular order):