23 September 2013

Memory leaks in .Net applications. Yes – they happen all the time…

Anybody developing with C or C++ will be only too familiar with the insidious effects of memory leaks. Simply put, a memory leak occurs when an application doesn’t free up memory that it has finished using. Memory leaks are bugs with unpleasant, slow-burning consequences that can take some time to manifest themselves.

Framework-based development using languages such as C# and Java is supposed to take care of this by abstracting memory management away from the developer. However, this abstraction does not make applications immune to memory leaks. If anything, the promise of garbage collection and abstracted memory management lulls many .Net developers into a false sense of security.

For example, the .Net garbage collector is supposed to take care of memory management, but it only releases memory that is unreachable. If an application still holds a reference to an object then that object won’t be released. Therefore, there’s still plenty of scope for writing a leaky application in the .Net framework. Developers do need to be aware of what’s going on under the hood as there are a number of common traps for the unwary.

Static references

Objects that are referenced by static fields are not released for as long as the field holds the reference, which usually means for the lifetime of the application. This may seem straightforward, but it also applies to any objects that are indirectly referred to by a static field. If you don’t pay attention to indirect references then you may get an ever-increasing chain of object references building up. They will hang around forever because the root reference at the start of the chain is static.
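
A minimal sketch of how such a chain builds up. The `AuditLog` and `Entry` names are invented for this illustration and do not come from any real library:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types, invented for this sketch.
public sealed class Entry
{
    // Anything an entry references stays reachable along with the entry.
    public byte[] Payload = new byte[1024];
}

public static class AuditLog
{
    // The static field acts as a root: the list, and everything it references,
    // stays reachable for as long as the field points at it.
    private static readonly List<Entry> Entries = new List<Entry>();

    public static int Count
    {
        get { return Entries.Count; }
    }

    public static void Record(Entry entry)
    {
        Entries.Add(entry);
    }

    // Without an explicit way to let go of entries, the list only ever grows.
    public static void Clear()
    {
        Entries.Clear();
    }
}
```

Every call to `AuditLog.Record` pins another entry and its payload in memory until `Clear` is called, even if the caller dropped its own reference long ago.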

Events and “lapsed listeners”

The .Net event model is based on creating subscriptions to events, much like the code below:

    obj.Changed += new ChangedEventHandler(sender);

This causes the event publisher to maintain a reference to the subscriber. This becomes a leak if the publisher outlives the subscriber and the subscriber never explicitly unsubscribes. The subscriber object cannot be released and the publisher is left broadcasting to a “zombie” object that is no longer required.

One solution to this problem is using a weak reference that allows the garbage collector to release the object while the application reference still exists. However, Microsoft recommend against using this as a default solution to memory management problems. Ultimately, more robust management of event subscriptions is the preferred approach to avoiding leaks through lapsed listeners.
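
One way to make subscription management more robust is to tie the unsubscribe call to `Dispose`. The sketch below assumes a hypothetical `Publisher` and `Subscriber` pair, invented here for illustration, and uses the standard `EventHandler` delegate:

```csharp
using System;

// Hypothetical publisher and subscriber, invented for this sketch.
public sealed class Publisher
{
    public event EventHandler Changed;

    public void RaiseChanged()
    {
        // Copy the delegate first so a concurrent unsubscribe cannot null it mid-call.
        EventHandler handler = Changed;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }

    // Number of handlers the publisher is currently keeping alive.
    public int SubscriberCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }
}

public sealed class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public int Notifications { get; private set; }

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.Changed += OnChanged;   // the publisher now references this object
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Notifications++;
    }

    public void Dispose()
    {
        _publisher.Changed -= OnChanged;   // drop the publisher's reference to this object
    }
}
```

Creating the subscriber inside a `using` block then guarantees that the publisher lets go of it when the block exits.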

Incorrect or missing clean-up code

The garbage collector doesn’t free .Net developers from the responsibility of cleaning up after themselves. This is what the IDisposable interface is for. If a type implements it then that’s a signal that there’s something that needs to be cleaned up, and its Dispose method should always be called once you have finished with the object.

This is particularly the case for resources that live outside the managed heap, where the garbage collector is not responsible for clean-up. Any resources that sit outside of the .Net framework such as database connections, file handles and COM objects are unmanaged so must be disposed of explicitly.

It’s surprising how often developers forget to do this but it’s probably the single most important responsibility developers have in resource management. Any application that uses unmanaged resources will drown at scale if the clean-up code isn’t being called properly.
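
As a rough sketch of what this clean-up code looks like, the hypothetical `NativeBuffer` class below (invented for this illustration) wraps a block of unmanaged memory obtained from `Marshal.AllocHGlobal` and releases it in `Dispose`:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around unmanaged memory, invented for this sketch.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _handle;

    public NativeBuffer(int size)
    {
        // This memory is allocated outside the managed heap,
        // so the garbage collector will never reclaim it by itself.
        _handle = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _handle == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);   // clean-up is done, so the finalizer is not needed
    }

    // Safety net only: runs at some unspecified time if Dispose was never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle);
            _handle = IntPtr.Zero;   // makes repeated calls harmless
        }
    }
}
```

Wrapping the object in a `using` statement, as in `using (var buffer = new NativeBuffer(4096)) { ... }`, ensures `Dispose` runs when the block exits, even if an exception is thrown inside it.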

Dead threads

Poor thread management can also be a source of leaks. The garbage collector is not responsible for cleaning up the stack, as any space reserved is automatically reclaimed when methods return. However, you can still leak stack memory through poor thread management. If an application creates worker threads for background tasks and does not terminate them properly then the memory reserved for each thread’s stack is never released.

The code below is a contrived example of this. Each time you press “ENTER” a new thread is spawned that attempts a Join operation on itself, so it blocks forever and never exits.

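    using System;
    using System.Threading;

    class Program
    {
        static void Main()
        {
            while (true)
            {
                Console.WriteLine("Press <ENTER> to start a thread...");
                Console.ReadLine();
                // The local reference is overwritten on the next iteration,
                // but the running thread itself stays alive.
                Thread t = new Thread(new ThreadStart(ThreadProc));
                t.Start();
            }
        }

        static void ThreadProc()
        {
            Console.WriteLine("Thread #{0} started...", Thread.CurrentThread.ManagedThreadId);
            // Block the thread: waiting for itself to finish means it never returns
            Thread.CurrentThread.Join();
        }
    }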

With each iteration of the loop the Thread object reference is dropped but the actual managed thread remains. If you track the memory usage for the application you’ll see it slowly moving up with each new thread as the stack memory reserved for it fails to get released.
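
For contrast, here is a sketch of a worker thread that can be shut down cleanly. The `Worker` class is invented for this illustration. It waits on a `ManualResetEventSlim` signal rather than on itself, so its method can return and its stack can be released:

```csharp
using System;
using System.Threading;

// Hypothetical worker, invented for this sketch.
public sealed class Worker : IDisposable
{
    private readonly ManualResetEventSlim _stop = new ManualResetEventSlim(false);
    private readonly Thread _thread;
    private bool _disposed;

    public Worker()
    {
        _thread = new Thread(Run);
        _thread.IsBackground = true;   // will not keep the process alive on its own
        _thread.Start();
    }

    public bool IsAlive
    {
        get { return _thread.IsAlive; }
    }

    private void Run()
    {
        // Block until asked to stop, then return so the thread can end.
        _stop.Wait();
    }

    public void Dispose()
    {
        if (_disposed)
        {
            return;
        }
        _disposed = true;
        _stop.Set();       // ask the worker to finish
        _thread.Join();    // the owner waits for the worker; the worker never joins itself
        _stop.Dispose();
    }
}
```

The important difference is that something outside the thread owns its lifetime and is responsible for ending it.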

Large object fragmentation

.Net manages memory in two main areas known as heaps. One is reserved for small objects and is regularly compacted to ensure optimal memory usage. If an object is 85,000 bytes or larger then it is allocated onto the “large” object heap. This part of managed memory is not compacted automatically due to the cost of moving larger objects. The framework just adds objects to this memory wherever it finds some free space.

As these larger objects are allocated and freed, gaps inevitably start to appear in the large object heap. Memory use becomes much less efficient and under certain circumstances this can cause “out of memory” exceptions as the heap becomes hopelessly fragmented.

This fragmentation depends very much on your usage pattern but the more extreme manifestations tend to be associated with long-running processes that make frequent updates to large memory objects. Recent additions to the .Net framework allow you to initiate a clean-up of large heap fragmentation but it still remains the responsibility of the developer to be aware of memory allocation for large objects.
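
The text does not name the addition it has in mind. One candidate, assumed here, is the `GCSettings.LargeObjectHeapCompactionMode` property available from .Net Framework 4.5.1 onwards. A minimal sketch of requesting a one-off compaction, where the `LargeObjectHeapDemo` class name is invented for this illustration:

```csharp
using System;
using System.Runtime;

// Hypothetical helper, invented for this sketch.
public static class LargeObjectHeapDemo
{
    // For a freshly allocated object this indicates whether it went onto the
    // large object heap, whose objects are reported as the oldest generation.
    // Small objects that have survived several collections also report the
    // oldest generation, so this is only meaningful straight after allocation.
    public static bool IsInOldestGeneration(object obj)
    {
        return GC.GetGeneration(obj) == GC.MaxGeneration;
    }

    public static void CompactLargeObjectHeap()
    {
        // The setting applies to the next blocking full collection
        // and then resets itself to Default.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```

Compaction is expensive, so this is better suited to quiet periods in a long-running process than to routine use.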

Poor resource management

Ultimately you have finite resources so you should get used to the discipline of only using memory when you have to. If you hold onto a variable unnecessarily then you are preventing the garbage collector from freeing up memory as quickly as possible. It’s one more item that has to be managed through the heap. This may seem insignificant but it can start to make a difference if you hold on to too much unnecessary data through long-running operations.

Poorly designed applications inevitably have larger memory footprints than is strictly necessary. State buckets such as session caches encourage developers to create unnecessary and long-lived memory dumps. Over-aggressive caching can start to resemble the symptoms of a memory leak. Although the details of memory management may be largely hidden from developers, they still have a responsibility to protect finite memory resources and code as efficiently as possible.
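
One simple way to stop a cache behaving like a leak is to give it a hard size limit. The `BoundedCache` class below is a hypothetical sketch, invented for this illustration, that evicts the oldest entry once it is full. A production cache would more likely expire entries by age or memory pressure:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical size-limited cache, invented for this sketch.
// It evicts the oldest key (first in, first out) once the limit is reached.
public sealed class BoundedCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();
    private readonly Queue<TKey> _order = new Queue<TKey>();

    public BoundedCache(int capacity)
    {
        if (capacity < 1)
        {
            throw new ArgumentOutOfRangeException("capacity");
        }
        _capacity = capacity;
    }

    public int Count
    {
        get { return _items.Count; }
    }

    public void Set(TKey key, TValue value)
    {
        if (!_items.ContainsKey(key))
        {
            if (_items.Count == _capacity)
            {
                // Evict the oldest key so the cache never grows past its limit.
                _items.Remove(_order.Dequeue());
            }
            _order.Enqueue(key);
        }
        _items[key] = value;
    }

    public bool TryGet(TKey key, out TValue value)
    {
        return _items.TryGetValue(key, out value);
    }
}
```

With a capacity of two, adding a third key silently drops the first, so the memory footprint stays flat however long the process runs.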
