Large Object Heap fragmentation causes OutOfMemoryException - by Rüdiger Klaehn

Status: Won't Fix

Due to several factors the product team decided to focus its efforts on other items.
A more detailed explanation for the resolution of this particular item may have been provided in the comments section.


ID: 521147
Status: Closed
Type: Bug
Repros: 25
Votes: 46
Opened: 12/18/2009 3:01:10 AM
Access Restriction: Public

Description

In some situations, the large object heap will become heavily fragmented. The fragmentation can become so severe that the LOH uses more than twice as much memory as the total sum of allocated large objects.

The problem is described in detail in this article: 
http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/

It really takes a large amount of chutzpah to release a product with such a major flaw. It basically makes it almost impossible to write long-running server processes in .NET. This issue almost cost my company a contract, since we were experiencing inexplicable OutOfMemoryExceptions in a very important product.

The problem can be solved by 
a) treating large objects like normal objects and compacting the LOH during a generation 2 collection or 
b) using a classic heap management algorithm like the one from Donald Knuth to ensure that the total size of the heap never exceeds two times the total number of allocated bytes. 
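
(Illustrative sketch, not from the original report: a rough way to observe this gap at runtime is to compare live managed bytes with the process's private bytes. A large and growing ratio under heavy large-object churn is consistent with LOH fragmentation; private bytes also include native allocations, so treat this only as a heuristic.)

using System;
using System.Diagnostics;

static class FragmentationProbe
{
    public static void Report()
    {
        // Live managed bytes after a full, blocking collection...
        long live = GC.GetTotalMemory(true);
        // ...versus everything the process has privately committed
        // (includes free-but-reserved heap gaps, native heaps, etc.).
        long priv = Process.GetCurrentProcess().PrivateMemorySize64;
        Console.WriteLine("live: {0:N0}  private: {1:N0}  ratio: {2:F2}",
                          live, priv, (double)priv / live);
    }
}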

I don't expect Microsoft to do anything about this issue. After all, they are busy adding new language features like dynamic to C# in order to make it more buzzword-compliant. And they did not do anything about any of the other highly rated feedback items I have submitted. But I think everybody should know about this issue so that they can avoid the .NET platform for important applications.
Posted by Doctor J on 5/6/2013 at 12:59 PM
Using .NET 4.5, 32-bit process on a 64-bit machine:

With large blocks: 1714Mb allocated
With large blocks, frequent garbage collections: 1714Mb allocated
Only small blocks: 1787Mb allocated
With large blocks, large blocks not growing: 1779Mb allocated
Posted by SamCPP on 7/23/2012 at 9:45 PM
I see this blog post, but I'm not sure whether it allows defragmenting the LOH.
http://blogs.msdn.com/b/dotnet/archive/2012/07/20/the-net-framework-4-5-includes-new-garbage-collector-enhancements-for-client-and-server-apps.aspx

In any case, it appears to be positive movement for the framework and I hope work continues in this area - particularly if fragmentation of the LOH can still result in OutOfMemoryExceptions.
Posted by SamCPP on 1/12/2012 at 4:12 PM
"Very disapointed to find out that this is an issue so late in our development cylce!"

Yes. For large-dataset applications, you start to wonder whether .NET is a serious player in this market.
Posted by SamCPP on 1/10/2012 at 2:32 PM
I can confirm that LOH fragmentation is still an issue in our application on .NET 4. We are getting OutOfMemoryExceptions when memory usage is approximately 1.3GB when running as x86 (some of our clients' production and test systems are 32-bit, so 64-bit is not currently an option).
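
(A side note on the 32-bit ceiling specifically, independent of the fragmentation bug: a 32-bit executable marked large-address-aware gets a 4GB address space on 64-bit Windows instead of 2GB, which can buy some headroom. A sketch of a post-build step; "MyApp.exe" stands in for the real executable name:)

rem Mark the executable large-address-aware (run from a Visual Studio
rem command prompt). This does not fix LOH fragmentation; it only raises
rem the ceiling at which the OOM occurs.
editbin /LARGEADDRESSAWARE MyApp.exe
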
Posted by Connor Douglas on 8/16/2011 at 5:31 AM
This problem has caused me several sleepless nights and is currently delaying a project from going into production. I don't understand why Microsoft will not look at this problem. I am dealing with a heavy image-processing application with large arrays.

The application is meant to run periods of years without being restarted.

Very disappointed to find out that this is an issue so late in our development cycle!
Posted by cfneese on 4/11/2011 at 10:54 PM
In .NET 4.0 the test code below does indeed fragment the LOH. However, there are several details in the article at http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/ that are not correct with .NET 4.0. Most importantly, the small blocks are not added to the "top" of the heap. You can use the following code just after the catch block of the Fill function to see this:

            // Needs: using System.IO; using System.Runtime.InteropServices;
            unsafe
            {
                var w = new StreamWriter(@".\test.txt");
                for (int i = 0; i < count; i++)
                {
                    // Pin each small block just long enough to read its address.
                    var handle = GCHandle.Alloc(smallBlocks[i], GCHandleType.Pinned);
                    w.WriteLine(String.Format("{0,10}\t{1,10}", i, handle.AddrOfPinnedObject()));
                    handle.Free();
                }
                w.Close();
            }
Investigation of the resulting "test.txt" file shows that the small blocks are not added in the order described by the reference article. The fragmentation occurs because none of the small buffers are GCed; after about 6000 of them have been allocated (roughly 6000 × 90,000 bytes, or about 515MB) there aren't any ~16MB contiguous free segments left. But if some of the small buffers are GCed, this doesn't happen.

Indeed, replacing
            List<byte[]> smallBlocks = new List<byte[]>();
with
            byte[][] smallBlocks = new byte[1000][];
and
                    smallBlocks.Add(new byte[blockSize]);
with
                    smallBlocks[count%1000] = new byte[blockSize];
allows the cited example to run almost indefinitely.
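
Put together, the core of the Fill loop then looks like this (a sketch of the change described above, dropped into the repro program posted later in this thread):

            // Keep only the most recent 1000 small blocks reachable. Older
            // blocks become collectible, so their address ranges can be
            // reused and the growing large block can still find a
            // contiguous region.
            byte[][] smallBlocks = new byte[1000][];
            for (; ; )
            {
                if (alwaysGC) GC.Collect();
                if (allocateBigBlocks) bigBlock = new byte[largeBlockSize];
                if (grow) largeBlockSize++;
                smallBlocks[count % 1000] = new byte[blockSize];
                count++;
            }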


It would be nice if there were a command to manually force the LOH to defragment, but given the way .NET 4.0 uses the LOH, the current example is unrealistically harsh.
Posted by B. David Holt on 4/7/2011 at 4:51 AM
Simply allowing us to call a function like

GC.CompactLargeObjectHeap()

would really be amazing.


Or have a setting like GC.CompactLargeObjectHeapOnMemoryException that we could set to true.

That would leave the current behavior intact for everyone else, but let those of us who desperately need to address this issue do it.
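
(.NET 4.5.1 later added almost exactly this capability via GCSettings in System.Runtime. A minimal sketch, assuming .NET 4.5.1 or newer:)

using System;
using System.Runtime;

static class LohCompaction
{
    public static void CompactNow()
    {
        // Request a one-shot LOH compaction on the next full blocking
        // collection. The setting resets to Default automatically after
        // that collection runs.
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}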
Posted by jonbnews on 9/30/2010 at 2:35 PM
As suggested in one of the comments in the original article: http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/

I think it would be useful to provide some sort of API to force compaction of the LOH if so desired, or to have some setting like GC.DoLargeObjectHeapCompactionWhenOutOfMemory = true (which could be false by default). When the runtime detects OutOfMemory on the LOH, it could compact the heap and then retry the allocation if this setting were true. In addition, perhaps an API like GC.CompactLargeObjectHeap() would not be unthinkable.

This problem occurs surprisingly often in real-world long-running server-side programs.
Posted by dthouston on 5/13/2010 at 8:07 AM
We circumvented this problem in a .NET 2.0 app by pre-allocating static objects on the LOH, and then re-using them. In our case, by using ADPlus and Windbg w/SOS we found that the OOM was being caused by decompressing incoming very large byte arrays, so we pre-allocated the MemoryStream (LOH_MS) which holds the incoming data and the byte array (LOH_BA) which holds the decompressed data. Not particularly elegant, but it keeps the app from throwing OOM exceptions!

// NOTE that the following (at least as presently written) is not thread-safe!
// Context assumed from the description above: LOH_MS is a pre-allocated
// MemoryStream large enough for any incoming payload, LOH_BA is a
// pre-allocated byte[] large enough for any decompressed payload, ba is the
// incoming compressed byte array, and dpSerLen is (presumably) the known
// decompressed length.

// Write incoming byte array ba into pre-allocated MemoryStream LOH_MS in preparation for decompressing it
LOH_MS.Position = 0;
LOH_MS.SetLength(0);

// write the incoming byte array to a MemoryStream
LOH_MS.Write(ba, 0, ba.Length);

// Decompress MemoryStream LOH_MS into pre-allocated byte array LOH_BA
    
// set MemoryStream at its beginning again (we're about to read from it)
LOH_MS.Position = 0;
    
// create decompress-mode DeflateStream to read from LOH_MS
DeflateStream dsDecompress = new DeflateStream(LOH_MS, CompressionMode.Decompress, true);
    
// decompress from MemoryStream LOH_MS into byte array LOH_BA
int bytesRead = dsDecompress.Read(LOH_BA, 0, dpSerLen);
dsDecompress.Close();
dsDecompress = null;

// remember how much of LOH_BA we're using
int lohBAinUse = dpSerLen;
    
// Write byte array LOH_BA into MemoryStream LOH_MS to be returned to caller
LOH_MS.Position = 0;
LOH_MS.SetLength(0);
    
// Write the uncompressed byte array LOH_BA into MemoryStream LOH_MS
LOH_MS.Write(LOH_BA, 0, lohBAinUse);
    
// position MemoryStream at its beginning again (caller is about to read from it)
LOH_MS.Position = 0;
return LOH_MS;
Posted by Edsger on 5/5/2010 at 10:35 AM
I was able to reproduce the problem with Visual Studio 2010 (final).

These are the results on a Windows 7 machine.

With large blocks: 630Mb allocated
With large blocks, frequent garbage collections: 590Mb allocated
Only small blocks: 1819Mb allocated
With large blocks, large blocks not growing: 638Mb allocated

Posted by Softlion on 3/19/2010 at 9:53 AM
On my 32-bit Vista laptop with .NET 2 / 3GB:

With large blocks: 21Mb allocated
With large blocks, frequent garbage collections: 26Mb allocated
Only small blocks: 1827Mb allocated
With large blocks, large blocks not growing: 707Mb allocated
Posted by Softlion on 3/19/2010 at 9:45 AM
Oops, sorry.
In fact that means .NET 4 is worse than .NET 3.5 :(
Posted by Softlion on 3/19/2010 at 9:38 AM
Hi,
I just ran the sample with VS2010 RC on a 32-bit platform:

With large blocks: 418Mb allocated
With large blocks, frequent garbage collections: 598Mb allocated
Only small blocks: 1819Mb allocated
With large blocks, large blocks not growing: 422Mb allocated

So only "small blocks" are still using too much memory ?
Posted by arghhhhhhhhhhh on 3/13/2010 at 4:37 AM
Just stumbled upon this problem too. We have a client/server app where the server generates lots of data and sends it to the client in 5MB blocks using remoting. Sprinkling GC.Collect() around where the OOM happens seems to "fix" it, but the GC is obviously not doing its job, favoring throwing OutOfMemoryException instead of collecting memory...

This is a critical bug in .NET 2.0/3.5, can't believe it's not fixed.
Posted by CatZimmermann on 2/27/2010 at 1:36 PM
Nels, on the .NET 4 RC I see the twenty-fold increase the VS team is claiming. From ~20MB to ~550MB. It's not as dismal as before.
Posted by Nels Olsen1 on 2/17/2010 at 1:46 PM
I just upgraded to VS 2010 RC1 and get the same dismal result ...
Posted by Nels Olsen1 on 2/2/2010 at 1:17 PM
We're using VS2010 Beta2 with .NET 4 and still see this problem. When you say this is fixed in .NET 4, are you referring to the upcoming production release? When can we expect that?
Posted by Brandon [MSFT] on 1/20/2010 at 3:47 PM
Hi Rüdiger,

I'm the lead program manager responsible for the garbage collector in the .NET Framework. Thank you for reporting the issue you found. Upon investigating this, we believe that a fix was already made to the garbage collector in .NET 4 that should address this problem with the large object heap.

Based on the example provided, we were able to allocate nearly 23 times as much memory before running out of memory on the large object heap going from version 3.5 to version 4. That's not to say we are finished addressing fragmentation issues—we will continue to pay attention as we improve in future versions. In the .NET 4 release, we heard from customers that latency was a high priority. So that is where we have spent much of our focus.

Unfortunately, this does mean that for .NET 4 we will not be making any further changes to address this bug report. We are currently assessing what the priorities are for the next version of .NET, and I would enjoy your feedback. Please feel free to send me mail at brandon.bray@microsoft.com with anything you believe would help us build a better product.

Thank you again for making us spend time thinking about your scenario,
Brandon Bray
Posted by Rüdiger Klaehn on 12/23/2009 at 2:32 AM
I think the best solution would be to just eliminate the large object heap completely. Maybe have some special treatment for large objects, like not using the nursery, but other than that treat them like any other object:

Moving even large objects on a modern CPU takes milliseconds. On my 2-year-old Q6600, copying 128MB takes just 60ms (about 2GB/s, as the measurement below shows). Most applications do not use such large objects. And for those that do use them, an occasional 60ms latency is much preferable to erratic and unpredictable OutOfMemoryExceptions!


        static void CopyTest(int n)
        {
            // Copy an n-byte buffer 1000 times and report the average
            // cost of a single copy.
            var src = new byte[n];
            var tgt = new byte[n];
            var t0 = DateTime.UtcNow;
            for (var i = 0; i < 1000; i++)
                Buffer.BlockCopy(src, 0, tgt, 0, src.Length);
            var dt = DateTime.UtcNow - t0;
            Console.WriteLine("Copying {0} bytes takes {1}", n, dt.TotalSeconds / 1000);
        }

Copying 134217728 bytes takes 0.067965
Posted by Rüdiger Klaehn on 12/21/2009 at 6:48 AM
I just ported the program to Java. Here is the output with a maximum heap size of 1024 megabytes:

With large blocks: 975 Mb allocated
With large blocks, frequent garbage collections: 975 Mb allocated
Only small blocks: 1015 Mb allocated
With large blocks, large blocks not growing: 975 Mb allocated

So it seems that the Java VM has no problem with large objects.

---

package loh_test;

import java.util.ArrayList;
import java.util.List;

public class Main {

    /**
     * Static variable used to store our 'big' block. Only the most recent big
     * block stays reachable; each new assignment makes the previous one
     * eligible for garbage collection.
     */
    static byte[] bigBlock;

    /**
     * Allocates 90,000-byte blocks, optionally interspersed with larger blocks.
     */
    static void fill(boolean allocateBigBlocks, boolean grow, boolean alwaysGC) {

        // Number of bytes in a small block
        // 90000 bytes, just above the limit for the LOH
        final int blockSize = 90000;
        
        // Number of bytes in a larger block: 16Mb initially
        int largeBlockSize = 1 << 24;

        // Number of small blocks allocated
        int count = 0;

        try {

            // We keep the 'small' blocks around
            // (imagine an algorithm that allocates memory in chunks)

            List<byte[]> smallBlocks = new ArrayList<byte[]>();

            for (;;) {

                // Write out some status information
                if ((count % 1000) == 0) {
                    System.out.println(count);
                }

                // Force a GC if necessary
                if (alwaysGC) {
                    Runtime.getRuntime().gc();
                }

                // Allocate a larger block if we're set up to do so
                if (allocateBigBlocks) {
                    bigBlock = new byte[largeBlockSize];
                }

                // The next 'large' block will be just slightly larger
                if (grow) {
                    largeBlockSize++;
                }

                // Allocate a new block
                smallBlocks.add(new byte[blockSize]);

                count++;

            }

        } catch (OutOfMemoryError err) {

            // Force a GC, which should empty the LOH again
            bigBlock = null;
            Runtime.getRuntime().gc();

            // Display the results for the amount of memory we managed to allocate
            System.out.println(String.format("%s: %s Mb allocated", (allocateBigBlocks ? "With large blocks" : "Only small blocks") + (alwaysGC ? ", frequent garbage collections" : "") + (grow ? "" : ", large blocks not growing"), (count * blockSize) / (1024 * 1024)));

        }

    }

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        // Display results for cases both with and without the larger blocks
        fill(true, true, false);
        fill(true, true, true);
        fill(false, true, false);
        fill(true, false, false);
    }
}
Posted by Microsoft on 12/20/2009 at 8:13 PM
Thanks for your feedback.

We are rerouting this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow up on your issue.

Thank you
Posted by Rüdiger Klaehn on 12/18/2009 at 3:16 AM
I tried to attach the program to reproduce this, but I don't see it. So here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LOH_test
{
    class Program
    {
        /// <summary>
        /// Static variable used to store our 'big' block. Only the most recent
        /// big block stays reachable; each new assignment makes the previous
        /// one eligible for garbage collection.
        /// </summary>
        static byte[] bigBlock;

        /// <summary>
        /// Allocates 90,000-byte blocks, optionally interspersed with larger blocks
        /// </summary>
        static void Fill(bool allocateBigBlocks, bool grow, bool alwaysGC)
        {
            // Number of bytes in a small block
            // 90000 bytes, just above the limit for the LOH
            const int blockSize = 90000;

            // Number of bytes in a larger block: 16Mb initially
            int largeBlockSize = 1 << 24;

            // Number of small blocks allocated
            int count = 0;

            try
            {
                // We keep the 'small' blocks around
                // (imagine an algorithm that allocates memory in chunks)
                List<byte[]> smallBlocks = new List<byte[]>();

                for (; ; )
                {
                    // Write out some status information
                    if ((count % 1000) == 0)
                    {
                        Console.CursorLeft = 0;
                        Console.Write(new string(' ', 20));
                        Console.CursorLeft = 0;
                        Console.Write("{0}", count);
                        Console.CursorLeft = 0;
                    }

                    // Force a GC if necessary
                    if (alwaysGC) GC.Collect();

                    // Allocate a larger block if we're set up to do so
                    if (allocateBigBlocks)
                    {
                        bigBlock = new byte[largeBlockSize];
                    }

                    // The next 'large' block will be just slightly larger
                    if (grow) largeBlockSize++;

                    // Allocate a new block
                    smallBlocks.Add(new byte[blockSize]);
                    count++;
                }
            }
            catch (OutOfMemoryException)
            {
                // Force a GC, which should empty the LOH again
                bigBlock = null;
                GC.Collect();

                // Display the results for the amount of memory we managed to allocate
                Console.WriteLine("{0}: {1}Mb allocated"
                                 , (allocateBigBlocks ? "With large blocks" : "Only small blocks")
                                 + (alwaysGC ? ", frequent garbage collections" : "")
                                 + (grow ? "" : ", large blocks not growing")
                                 , (count * blockSize) / (1024 * 1024));
            }
        }

        static void Main(string[] args)
        {
            // Display results for cases both with and without the larger blocks
            Fill(true, true, false);
            Fill(true, true, true);
            Fill(false, true, false);
            Fill(true, false, false);

            Console.ReadLine();
        }
    }
}
Posted by Rüdiger Klaehn on 12/18/2009 at 3:15 AM
I just ran the provided example in Visual Studio 2010 beta 2. Here is the output:

With large blocks: 574Mb allocated
With large blocks, frequent garbage collections: 534Mb allocated
Only small blocks: 1676Mb allocated
With large blocks, large blocks not growing: 582Mb allocated

This is a 32-bit program on a 64-bit machine, so the amount of available memory is slightly less than 2GB. As you can see, with large objects the LOH is so fragmented that the program runs out of memory after less than 1/3 of the available memory is consumed. If you run this as a 64-bit program, the whole computer slows to a crawl due to excessive paging.