The best explanation of expression trees I've ever read is this article by Charlie Calvert.
Summarizing:
An expression tree represents what you want to do, not how you want to do it.
Consider the following very simple lambda expression:
Func<int, int, int> function = (a, b) => a + b;
This statement consists of three parts:
- Declaration: Func<int, int, int> function
- Assignment operator: =
- Lambda expression: (a, b) => a + b;
The function variable points to raw executable code that knows how to add two numbers.
This is the most important difference between delegates and expressions. You call function without knowing what it will do with the two integers you pass in. All your code can know is that it takes two integers and returns one.
In the previous section, you saw how to declare a variable pointing to raw executable code. Expression trees are not executable code; they are a form of data structure.
Unlike with a delegate, your code can now find out what the expression tree is meant to do.
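To make the point about delegates concrete, here is a minimal sketch. The class name DelegateDemo and its members are made up for illustration. It shows that a delegate can be called, and that reflection exposes its signature, but nothing in it gives your code the "a + b" logic as data.

```csharp
using System;

// Hypothetical helper class, used only for this illustration.
public static class DelegateDemo
{
    // The same lambda as in the text, compiled to executable code.
    public static readonly Func<int, int, int> Function = (a, b) => a + b;

    // Invoking the delegate runs the code; the caller just gets a result back.
    public static int Call()
    {
        return Function(2, 3);
    }

    // Reflection reveals the shape of the method (two parameters, one return
    // value), but not what the body does with them.
    public static int ParameterCount()
    {
        return Function.Method.GetParameters().Length;
    }
}
```

Calling DelegateDemo.Call() adds the two numbers, but the caller only learns that by running the code.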
LINQ provides a simple syntax for translating code into a data structure called an expression tree. The first step is to add a using directive for the System.Linq.Expressions namespace:
using System.Linq.Expressions;
Now we can create an expression tree:
Expression<Func<int, int, int>> expression = (a, b) => a + b;
The same lambda expression shown in the previous example is converted to an expression tree declared as type Expression<T>. The identifier expression is not executable code; it is a data structure called an expression tree.
This means that you cannot simply invoke the expression tree the way you can invoke a delegate, but you can inspect it. So what can your code learn by inspecting the expression variable?
// `expression.NodeType` returns ExpressionType.Lambda.
// `expression.Type` returns Func<int, int, int>.
// `expression.ReturnType` returns Int32.

var body = expression.Body;
// `body.NodeType` returns ExpressionType.Add.
// `body.Type` returns System.Int32.

var parameters = expression.Parameters;
// `parameters.Count` returns 2.

var firstParam = parameters[0];
// `firstParam.Name` returns "a".
// `firstParam.Type` returns System.Int32.

var secondParam = parameters[1];
// `secondParam.Name` returns "b".
// `secondParam.Type` returns System.Int32.
Here we see that there is a lot of information that we can get from the expression.
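As a rough sketch of going one level deeper, the body of this particular tree can be cast to BinaryExpression to reach its operands, and the whole tree can be turned back into a delegate with Compile(). The class name TreeDemo and its method names are invented for this example, and the casts assume the body is exactly "a + b".

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper class, used only for this illustration.
public static class TreeDemo
{
    public static readonly Expression<Func<int, int, int>> Tree = (a, b) => a + b;

    // Describes the body by looking at its operator and its two operands.
    // The casts assume a body of the form "parameter op parameter".
    public static string Describe(Expression<Func<int, int, int>> tree)
    {
        var body = (BinaryExpression)tree.Body;
        var left = (ParameterExpression)body.Left;
        var right = (ParameterExpression)body.Right;
        return left.Name + " " + body.NodeType + " " + right.Name;
    }

    // Compile() turns the data structure into a delegate that can be invoked.
    public static int CompileAndRun(Expression<Func<int, int, int>> tree, int x, int y)
    {
        Func<int, int, int> compiled = tree.Compile();
        return compiled(x, y);
    }
}
```

So the tree cannot be invoked directly, but it can be both examined as data and compiled into something that can be invoked.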
But why do we need this?
You have learned that an expression tree is a data structure representing executable code. But so far we have not answered the central question of why anyone would want to make such a conversion. This is the question we asked at the beginning of this post, and it is time to answer it.
A LINQ to SQL query is not executed inside your C# program. Instead, it is translated into SQL, sent across the wire, and executed on the database server. In other words, the following code is never executed inside your program:
var query = from c in db.Customers
            where c.City == "Nantes"
            select new { c.City, c.CompanyName };
It is first converted to the following SQL statement, and then executed on the server:
SELECT [t0].[City], [t0].[CompanyName]
FROM [dbo].[Customers] AS [t0]
WHERE [t0].[City] = @p0
The code found in the query expression must be translated into a SQL query that can be sent to another process as a string. In this case, that process is a SQL Server database. Obviously, it is much easier to translate a data structure such as an expression tree into SQL than to translate raw IL or executable code into SQL. To exaggerate the difficulty of the problem, just imagine trying to translate a series of zeros and ones into SQL!
When it is time to translate the query expression into SQL, the expression tree representing your query is taken apart and analyzed, just as we took apart our simple lambda expression tree in the previous section. Of course, the algorithm LINQ to SQL uses to analyze the expression tree is much more sophisticated than the one we used, but the principle is the same. Once it has analyzed the parts of the expression tree, LINQ mulls them over and decides on the best way to write a SQL statement that will return the requested data.
Expression trees were created to make it possible to convert code, such as a query expression, into a string that can be passed to another process and executed there. It is that simple. There is no great secret, no magic wand to be waved. One simply takes the code, converts it into data, and then analyzes the data to find the components that will be translated into a string that can be passed to another process.
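The principle can be sketched with a toy translator. This is not how LINQ to SQL is implemented; it is a deliberately tiny illustration that handles only equality tests and && on string properties. The Customer class and the ToySqlTranslator name are assumptions made up for the example, and the sketch inlines string literals where a real provider would emit parameters such as @p0.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity type standing in for a row of the Customers table.
public class Customer
{
    public string City { get; set; }
    public string CompanyName { get; set; }
}

// Toy translator: walks a predicate's expression tree and builds a WHERE clause.
public static class ToySqlTranslator
{
    public static string ToWhereClause(Expression<Func<Customer, bool>> predicate)
    {
        return "WHERE " + Translate(predicate.Body);
    }

    private static string Translate(Expression node)
    {
        switch (node)
        {
            // c.City == "Nantes"  becomes  [City] = 'Nantes'
            case BinaryExpression b when b.NodeType == ExpressionType.Equal:
                return Translate(b.Left) + " = " + Translate(b.Right);

            // left && right  becomes  (left AND right)
            case BinaryExpression b when b.NodeType == ExpressionType.AndAlso:
                return "(" + Translate(b.Left) + " AND " + Translate(b.Right) + ")";

            // A property read on the lambda parameter becomes a column name.
            case MemberExpression m when m.Expression is ParameterExpression:
                return "[" + m.Member.Name + "]";

            // A string literal becomes a quoted SQL literal (quotes doubled).
            case ConstantExpression c when c.Value is string s:
                return "'" + s.Replace("'", "''") + "'";

            default:
                throw new NotSupportedException("Unsupported node: " + node.NodeType);
        }
    }
}
```

With these assumptions, ToySqlTranslator.ToWhereClause(c => c.City == "Nantes") produces the string WHERE [City] = 'Nantes'. The lambda is never executed; only its structure is read.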
Because the query comes to the compiler encapsulated in such an abstract data structure, the compiler is free to interpret it in almost any way it wants. It is not forced to execute the query in a particular order or in a particular way. Instead, it can analyze the expression tree, discover what you want done, and then decide how to do it. At least in theory, it is free to consider any number of factors, such as current network traffic, the load on the database, the results it already has, and so on. In practice, LINQ to SQL does not take all of these factors into account, but it is theoretically free to do as it wishes. In addition, you could pass this expression tree to custom code that you write by hand, which could analyze it and translate it into something very different from what LINQ to SQL produces.
Once again, we see that expression trees allow us to represent (express?) what we want to do, and we use translators that decide how our expressions are carried out.