Friday 29 January 2010

Some Linq definitions

Sure, Linq is nothing new at this point, but I think there are still some misconceptions and wrongly used terms around, so a short recap can be useful (at least as an easily accessible note for myself).

Lambdas:
Not exclusive to Linq, but closely related to it.
A Lambda is just an abbreviated way to create an anonymous delegate or an Expression<TDelegate>. It makes heavy use of the type inference feature added in C# 3.0.
Lambdas have three parts:

  • the input parameters on the left

  • the => operator (that reads like "goes to")

  • the expression or statement on the right


Based on this third part we have 2 types of Lambdas:
Expression Lambdas and Statement Lambdas.

  • Statement Lambdas have braces that delimit a "statement block". This block can contain any C# code, so this type of lambda can be used wherever a delegate is expected. It's just syntactic sugar that the compiler turns into a delegate.

    sample - you can check this with our good friend Reflector :-)

    c => {
        Console.WriteLine("called");
        return c.StartsWith("L");
    }

    is equivalent to this:

    delegate (string c) {
        Console.WriteLine("called");
        return c.StartsWith("L");
    }



  • Expression Lambdas contain just one expression. This is the most commonly used kind of lambda. Again this is just syntactic sugar, and depending on where it is used, the compiler can turn it into a delegate or an Expression<TDelegate> (an expression tree).

    When we have this code (the Where here is the System.Linq.Queryable.Where extension method, that expects an Expression<TDelegate>):

    MyDataContext dc = new MyDataContext();
    var contacts = dc.Contacts.Where(c => c.FirstName.StartsWith("Greg"));

    this is translated into this:

    contacts = dc.Contacts.Where(
        Expression.Lambda<Func<Contact, bool>>(
            Expression.Call(
                Expression.Property(
                    CS$0$0000 = Expression.Parameter(typeof(Contact), "c"),
                    (MethodInfo) methodof(Contact.get_FirstName)),
                (MethodInfo) methodof(string.StartsWith),
                new Expression[] { Expression.Constant("Greg", typeof(string)) }),
            new ParameterExpression[] { CS$0$0000 }));


    however, if we have this code (the Where here is the System.Linq.Enumerable.Where extension method, that expects a Func<T, bool> delegate):

    List<string> cities = new List<string>(){"Berlin", "London", "Xixon", "Llangreu"};
    IEnumerable<string> results = cities.Where(c => c.StartsWith("L"));


    it gets translated into something like this:

    results = cities.Where(delegate(string c) {
        return c.StartsWith("L");
    });


    Of course, Expression<TDelegate> (expression trees) are terribly important for IQueryable Linq providers (like Linq to Sql). An expression tree is just an AST that the provider turns into code at runtime (for example T-SQL when the IQueryable provider is Linq to Sql). It's easy to see that this is a rather complex thing, and that's why only expression lambdas can be turned into Expression<TDelegate>. (I'm talking about .Net 3.5. With .Net 4.0 and the DLR the expression tree API grows statement nodes such as blocks and loops, although as far as I know the C# compiler still only converts expression lambdas.)
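To make the delegate versus expression tree distinction concrete, here is a minimal sketch (the class and field names are mine, just for illustration). The same lambda text is assigned once to a Func<string, bool> and once to an Expression<Func<string, bool>>; the first can only be invoked, while the second can be inspected as data.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaTargets
{
    // Target type is a delegate: the compiler emits an ordinary method.
    public static readonly Func<string, bool> AsDelegate =
        c => c.StartsWith("L");

    // Target type is Expression<TDelegate>: the compiler emits code
    // that builds an expression tree describing the lambda.
    public static readonly Expression<Func<string, bool>> AsTree =
        c => c.StartsWith("L");

    public static void Main()
    {
        // The delegate can be called directly.
        Console.WriteLine(AsDelegate("London"));          // True

        // The tree can be walked: its body is a method call node.
        Console.WriteLine(AsTree.Body.NodeType);          // Call
        Console.WriteLine(AsTree.Parameters[0].Name);     // c

        var call = (MethodCallExpression)AsTree.Body;
        Console.WriteLine(call.Method.Name);              // StartsWith
    }
}
```

Trying the same thing with a statement lambda (one with a { ... } body) on the Expression<Func<string, bool>> field is rejected by the compiler, which matches the restriction described above.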


Given an Expression<TDelegate> we can turn it into a TDelegate just by calling its Compile method (this is really beautiful: at compile time the compiler just emits code that builds an AST, and it's at runtime that the AST gets turned into CIL). Nevertheless, we can't convert a "good candidate" delegate (I mean, one whose body is just one expression) into an Expression<TDelegate>. There's no "magic" method to do that at runtime; it's just "compiler magic" that turns a lambda expression into an Expression<TDelegate>. To do this at runtime we would need to get the CIL for that delegate (that's easy: myDelegate.Method.GetMethodBody()) and then translate it into an expression tree (that's not easy...). I've done some googling and haven't found any implementation of this.
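A minimal sketch of the tree-to-delegate direction, assuming we build the tree by hand with the Expression factory methods (roughly what the compiler generates for a lambda). CompileDemo and BuildPredicate are names I made up for this example.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class CompileDemo
{
    // Builds the tree for c => c.StartsWith(prefix) and compiles it
    // into a real delegate at runtime.
    public static Func<string, bool> BuildPredicate(string prefix)
    {
        ParameterExpression c = Expression.Parameter(typeof(string), "c");

        // Pick the StartsWith(string) overload explicitly.
        MethodInfo startsWith =
            typeof(string).GetMethod("StartsWith", new[] { typeof(string) });

        MethodCallExpression body = Expression.Call(
            c, startsWith, Expression.Constant(prefix, typeof(string)));

        Expression<Func<string, bool>> tree =
            Expression.Lambda<Func<string, bool>>(body, c);

        // Compile() generates CIL for the tree and wraps it in a delegate.
        return tree.Compile();
    }

    public static void Main()
    {
        Func<string, bool> startsWithGreg = BuildPredicate("Greg");
        Console.WriteLine(startsWithGreg("Gregory"));   // True
        Console.WriteLine(startsWithGreg("Berlin"));    // False
    }
}
```

Because the prefix is only known when BuildPredicate runs, this also shows why the runtime step matters: the delegate is generated from data that did not exist at compile time.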

Linq
When writing "Linq code" we have two ways (two syntaxes) to write our code:

  • Query Syntax (Query Expressions)

  • Method Syntax (Standard Query Operators Method Calls)


The first one is just syntactic sugar that gets translated by the compiler into method calls. That means that this:

//Query Syntax
IEnumerable<string> results = from c in cities
                              where c.StartsWith("L")
                              select c;

and this:

//Method Syntax
IEnumerable<string> result = cities.Where(c => c.StartsWith("L"));

are just the same.
At the compiler level I wasn't sure whether this is a two-pass thing or not. As far as I can tell from the C# language specification, it is: query expressions are defined as a purely syntactic translation into method calls, which happens before type binding and overload resolution. So conceptually the first code block is rewritten into the second one, and then the usual C# to CIL compilation is done.
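A small sketch to check the equivalence, under my own naming (QuerySyntaxDemo and CountingSource are hypothetical, not part of any library). The first two methods run the same filter in both syntaxes. The third one uses query syntax over a type that is not an IEnumerable at all and only has a method called Where, which suggests the compiler is really just rewriting the query into method calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration: not an IEnumerable,
// it only exposes a method named Where with a compatible signature.
public class CountingSource
{
    public int WhereCalls;

    public CountingSource Where(Func<string, bool> predicate)
    {
        WhereCalls++;
        return this;
    }
}

public static class QuerySyntaxDemo
{
    public static readonly List<string> Cities =
        new List<string> { "Berlin", "London", "Xixon", "Llangreu" };

    public static IEnumerable<string> WithQuerySyntax()
    {
        return from c in Cities
               where c.StartsWith("L")
               select c;
    }

    public static IEnumerable<string> WithMethodSyntax()
    {
        return Cities.Where(c => c.StartsWith("L"));
    }

    public static int WhereCallsFromQuerySyntax()
    {
        var source = new CountingSource();
        // This compiles because the query is rewritten to
        // source.Where(c => c.StartsWith("L")).
        var result = from c in source
                     where c.StartsWith("L")
                     select c;
        return source.WhereCalls;
    }

    public static void Main()
    {
        Console.WriteLine(
            WithQuerySyntax().SequenceEqual(WithMethodSyntax())); // True
        Console.WriteLine(WhereCallsFromQuerySyntax());           // 1
    }
}
```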

