My name
is
Jon Skeet

Why does a lambda expression in C# cause a memory leak?

Note: this is not just some random useless code, this is an attempt to reproduce an issue with lambda expressions and memory leaks in C#.

Examine the following program in C#. It's a console application that simply:

Creates a new object of type Test
Writes to the console that the object was created
Calls garbage collection
Wait for any user input
Shuts down

I run this program using JetBrains DotMemory, and I take two memory snapshots: one after the object was initialized, and another after its been collected. I compare the snapshots and get what I expect: one dead object of type Test.

But here's the quandary: I then create a local lambda expression inside the object's constructor and I DO NOT USE IT ANYWHERE. It's just a local constructor variable. I run the same procedure in DotMemory, and suddenly, I get an object of type Test+<>, which survives garbage collection.

See the attached retention path report from DotMemory: The lambda expression has a pointer to the Test+<> object, which is expected. But who has a pointer to the lambda expression, and why is it kept in memory?

Also, this Test+<> object - I assume it is just temporary object to hold the lambda method, and has nothing to do with the original Test object, am I right?

public class Test
{
    public Test()
    {
        // this line causes a leak
        Func<object, bool> t = _ => true;
    }

    public void WriteFirstLine()
    {
        Console.WriteLine("Object allocated...");
    }

    public void WriteSecondLine()
    {
        Console.WriteLine("Object deallocated. Press any button to exit.");
    }
}

class Program
{
    static void Main(string[] args)
    {
        var t = new Test();
        t.WriteFirstLine();
        Console.ReadLine();
        t.WriteSecondLine();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.ReadLine();
    }
}

I suspect that what you're seeing is the effect of a compiler optimization.

Suppose Test() is called multiple times. The compiler could create a new delegate each time - but that seems a little wasteful. The lambda expression doesn't capture either this or any local variables or parameters, so a single delegate instance can be reused for all invocations of Test(). The compiler emits code to create the delegate lazily, but store it in a static field. So it's like this:

private static Func<object, bool> cachedT;

public Test()
{
    if (cachedT == null)
    {
        cachedT = _ => true;
    }
    Func<object, bool> t = cachedT;
}

Now that does create an object that will never be garbage collected, but it reduces GC pressure if Test is called frequently. The compiler can't really know which is likely to be better, unfortunately.

This is detectable with reference equality by looking at the delegates resulting from the lambda expression. For example, this prints True (at least for me; it's a compiler implementation detail):

using System;

class Test
{
    private Func<object> CreateFunc()
    {
        return () => new object();
    }

    static void Main()
    {
        Test t = new Test();
        var f1 = t.CreateFunc();
        var f2 = t.CreateFunc();
        Console.WriteLine(ReferenceEquals(f1, f2));
    }
}

But if you change the lambda expression to () => this; it prints False.

See more on this question at Stackoverflow