My name
is
Jon Skeet

C# method return versus variable in a lambda

So I was fighting some LINQ (entity framework) lambda which I ultimately fixed by storing the result in a variable. Something like:

IQueryable<object> DoStuff() { return /* some linq query */; }

void Main()
{
    var result = DoStuff();
    // (_database is DbContext) == true

    // Works fine
    _database.Table.Where(x => result.Contains(x.Id));

    // Throws "LINQ to Entities does not recognize the method '[...] DoStuff()' method, and this method cannot be translated into a store expression."
    _database.Table.Where(x => DoStuff().Contains(x.Id));
}

I was able to solve my problem (by accident) by just storing the query in a separate variable but I am curious as to why that works? Is this a deferred execution thing? Before this, I always thought that the two where exchangeable (method vs intermediate variable usage) for reference types but obviously they are not.

Sure - it's a matter of understanding what happens to the lambda expression.

For LINQ to SQL, EF etc, it's converted into an expression tree - basically data that represents the code in the lambda expression. The LINQ provider then has to convert that data into SQL. How can you turn a call to DoStuff() into SQL? You can't, because DoStuff() doesn't exist within the database, and the provider doesn't know what the method does to try to emulate it within SQL itself. The very limited set of method calls that do work are effectively hard-coded into the LINQ provider, along with the translations into SQL.

In theory, a particular smart LINQ provider could spot the call to DoQuery, analyze the IL within it to work out what it does, and check whether it understands everything within it, and convert it to SQL. I don't know any LINQ provider which does this, however, and I'd be quite surprised to see one which can do it reliably including further method calls within DoStuff. It would be a mammoth amount of work, and you wouldn't want it to happen at execution time anyway. (It would be awesome to see as a proof of concept, of course...)

Now, even with LINQ to Objects you'd have different behaviour between your two pieces of code. In the working code, DoStuff() is executed once, then the result used for a comparison against each element in the sequence. In the failing code, you'd be calling DoStuff() once per element in the sequence. It would still work, but you could end up executing queries multiple times unnecessarily. Note that here the transformation of the lambda expression is different - the compiler will use a delegate conversion instead of an expression tree conversion.

In your particular case, there may be even more going on - because your method returns IQueryable<object>, it's highly likely that what's returned from the method won't even have executed as a query yet - but assuming it's a LINQ query against the same database with the same provider, the LINQ provider will "understand" the query returned by DoStuff() and work out one SQL query including both the query part in DoStufF() and the use of it in Main. Have a look at the logs of what SQL is being executed to validate this.

See more on this question at Stackoverflow