Why is the garbage collector allowed to collect seemingly referenced objects with a finalizer?

This question is basically about why we need GC.KeepAlive() in the first place.

Here's where we need it. We have a wrapper for some unmanaged resource:

public class CoolWrapper
{
     public CoolWrapper()
     {
         coolResourceHandle = UnmanagedWinApiCode.CreateCoolResource();
         if (coolResourceHandle == IntPtr.Zero)
         {
             // something went wrong, throw exception
         }
     }

     ~CoolWrapper()
     {
         UnmanagedWinApiCode.DestroyCoolResource(coolResourceHandle);
     }

     public void DoSomething()
     {
         var result = UnmanagedWinApiCode.DoSomething(coolResourceHandle);
         if (result == 0)
         {
             // something went wrong, throw exception
         }
     }

     private IntPtr coolResourceHandle;
}

and our code uses that wrapper:

var wrapper = new CoolWrapper();
wrapper.DoSomething();

If this code runs in the Release configuration without a debugger attached, the optimizer may notice that the wrapper reference is never used after this code, and that the coolResourceHandle field is not accessed (by managed code) after DoSomething() has read it and passed its value into unmanaged code. The following can then happen:

  • DoSomething() is called
  • coolResourceHandle is read
  • garbage collection suddenly starts
  • ~CoolWrapper() runs
  • UnmanagedWinApiCode.DestroyCoolResource() runs, the resource is destroyed, and its handle is invalidated
  • UnmanagedWinApiCode.DoSomething() runs with a handle value that now refers to a nonexistent resource (or to another resource that has since been created and assigned the same handle)

The situation described above is actually possible: it is a race between a method of the object and a garbage collection running at the same time. It does not matter that there is a local variable of reference type on the stack. Optimized code ignores that reference, and the object becomes eligible for garbage collection immediately after coolResourceHandle is read inside DoSomething().

So, to prevent this we use GC.KeepAlive():

var wrapper = new CoolWrapper();
wrapper.DoSomething();
GC.KeepAlive(wrapper);

which makes the object ineligible for GC until the point where GC.KeepAlive() is invoked.

This of course requires every user to call GC.KeepAlive() everywhere, which they will forget, so the right place for it is inside CoolWrapper.DoSomething():

 public void DoSomething()
 {
     var result = UnmanagedWinApiCode.DoSomething(coolResourceHandle);
     GC.KeepAlive(this);
     if (result == 0)
     {
         // something went wrong, throw exception
     }
 }

and this basically prevents the object from becoming eligible for GC while one of its methods is running.

Why is this needed? Why wouldn't GC ignore the objects which have a method running at that moment and also have a finalizer? This would make life much easier yet we need to use GC.KeepAlive() instead.

Why is such aggressive collection allowed instead of ignoring objects which have methods currently running and a finalizer (and so likely to have problems in case there's a race as described above)?

Jon Skeet

Why is this needed? Why wouldn't GC ignore the objects which have a method running at that moment and also have a finalizer?

Because that's not what the GC (or the C# specification) guarantees. The guarantee is that an object won't be finalized or collected while it's still possible to read a field from it. If the JIT/GC detects that, although you're currently executing an instance method, there's no execution path whereby that method will read any more fields, it is legal for the object to be collected (assuming there's nothing else keeping it alive).

It's surprising, but that's the rule - and I strongly suspect that the reason for it is to allow optimization paths that would otherwise be impossible.
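
To make the rule a bit more concrete, here is a minimal sketch. The Sample class and its number field are hypothetical and not part of your code; they only mark the point after which the method no longer reads anything from the object.

```csharp
using System;

// Hypothetical class for illustration only.
public class Sample
{
    private readonly int number = 42;

    ~Sample()
    {
        Console.WriteLine("Finalizer ran");
    }

    public int Compute()
    {
        int local = number;   // last read of a field of 'this'

        // No code path below touches 'this' again. In optimized code the
        // JIT is allowed to treat the object as dead from here on, so a
        // collection at this point may finalize it even though Compute()
        // has not returned yet.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        return local * 2;
    }
}
```

Calling Console.WriteLine(new Sample().Compute()) prints the computed value either way. Whether "Finalizer ran" shows up before it depends on the build configuration, the runtime version and whether a debugger is attached, so treat this as an illustration rather than a reliable reproduction.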

Your fix of using GC.KeepAlive is a perfectly reasonable one. Note that the number of situations where this is relevant is pretty tiny.
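
For reference, a self-contained sketch of that fix could look like the following. It assumes Marshal.AllocHGlobal and Marshal.FreeHGlobal as stand-ins for your CreateCoolResource and DestroyCoolResource calls, and a simple read/increment/write on the allocated memory as a stand-in for the unmanaged DoSomething, since the real UnmanagedWinApiCode isn't shown.

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch only: the Marshal calls below are stand-ins (an assumption)
// for the UnmanagedWinApiCode functions from the question.
public sealed class CoolWrapper
{
    private readonly IntPtr coolResourceHandle;

    public CoolWrapper()
    {
        // Stand-in for CreateCoolResource(); AllocHGlobal throws on failure.
        coolResourceHandle = Marshal.AllocHGlobal(sizeof(int));
        Marshal.WriteInt32(coolResourceHandle, 0);
    }

    ~CoolWrapper()
    {
        // Stand-in for DestroyCoolResource().
        Marshal.FreeHGlobal(coolResourceHandle);
    }

    public int DoSomething()
    {
        // The handle value is copied out of the object here. Without the
        // KeepAlive call below, nothing after this line would use 'this'.
        IntPtr handle = coolResourceHandle;

        // Stand-in for the unmanaged DoSomething(): increment a counter
        // stored in the unmanaged memory and read it back.
        Marshal.WriteInt32(handle, Marshal.ReadInt32(handle) + 1);
        int result = Marshal.ReadInt32(handle);

        // Keeps 'this' reachable up to this point, so the finalizer
        // cannot free the memory while the calls above are using it.
        GC.KeepAlive(this);
        return result;
    }
}
```

With this in place, callers can write new CoolWrapper().DoSomething() without having to remember GC.KeepAlive() themselves.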


See more on this question at Stackoverflow