Category Archives: programming

NHibernate Linq query evaluation process

I like the Linq provider of NHibernate, but I have encountered weird behavior and had to dive into how does NHibernate creates the SQL queries from the linq queries and creation of Linq provider extensions. If you are only interested how to program NHibernate Linq function  (e.g. how to check if text column corresponds to some regular expression in database), this is not the post to read – read Michaels post instead. This one is about how does NHibernate turns a Linq query into a result of the query.

I will be mostly dealing with a select clause of Linq, but other are quite similar.

Problem

I had a query that checked for entity nullness before creating a derived object using some properties of null-checked entity:

from post in session.Query<Post>()
select new PostModel
{
  Body = post.Body,
  Blog = post.Blog != null ? new EntityReference
    {
      Id = post.Blog.Id,
      Name = post.Blog.Name
    } : null
}

However, the generated SQL had more fields than necessary, in fact, it had all fields of a Blog entity:

-- Cursive are the properties of Blog that are not needed
select
    post0_.Body as col_0_0_,
    blog1_.Id as col_1_0_,
    post0_.Blog as col_2_0_,
    blog1_.Name as col_3_0_,
    blog1_.Id as Id0_,
    blog1_.Name as Name0_,
    blog1_.Subtitle as Subtitle0_,
    blog1_.Created as Created0_ 
from
    Post post0_ 
left outer join
    Blog blog1_ 
        on post0_.Blog=blog1_.Id

It doesn’t look like a big problem, but that is because this is a demonstration. In reality I have been creating several references and for each entity the query required all its properties. The result was a query with 100+ columns (4 entities * 20-30 columns per entity) and quite slow execution.

I had no idea why that happened, because similar query without a null check worked fine and NHibernate generated SQL without unnecessary fields:

from post in session.Query<Post>()
select new PostModel
{
    Body = post.Body,
    Blog = new EntityReference
        {
            Id = post.Blog.Id,
            Name = post.Blog.Name
        }
}
-- 
select
    post0_.Body as col_0_0_,
    post0_.Blog as col_1_0_,
    blog1_.Name as col_2_0_ 
from
    Post post0_ 
left outer join
    Blog blog1_ 
        on post0_.Blog=blog1_.Id

Explanation

I had to dive into NHibernate source code, because Google turned up nothing and all books on NHibernate are either old (pre 3.0 =without Linq) or only go through some simple Linq queries. Official NHibernate documentation even mention how to create Linq extensions, much less how does the process works. Yay me.

Well, here is a short overview of how does NH turns a Linq query to a result:

1. Rewriting query to be more HQL friendly

We have a Linq expression (=Linq query) object supplied by user. It’s representations is identical to the representation in the code, which can be hard to process, so NHibernate will take the query and rewrites it a little to be more easily processed. Here is an example of how a query looks before and after rewriting:

/* Source query */
from Post post in value(NHibernate.Linq.NhQueryable[Nerula.Data.Post])
select new PostModel() {
  Body = [post].Body,
  Blog =
    IIF(([post].Blog != null),
    new EntityReference() 
    {
      Id = [post].Blog.Id,   
      Name = [post].Blog.Name    
    },   
    null)
}

/* Rewritten query */
from Post post in value(NHibernate.Linq.NhQueryable[Nerula.Data.Post]) 
from Blog _0 in [post].Blog 
select new PostModel() 
{
  Body = [post].Body,
  Blog = 
    IIF(([_0] != null),
      new EntityReference() 
      {
        Id = [post].Blog.Id, 
        Name = [_0].Name 
      }, 
      null)
}

 2. Find sub-expressions representable in HQL

The Linq query is represented as a tree of subexpression.

For example: new PostModel() {Body = [100001].Body, Blog = IIF(([100001] != null), new EntityReference() {Id = [100001].Blog.Id, Name = [100001].Name}, null)} is a Linq expression of type ExpressionType.MemberInit. It consists from several subexpressions:

  • new PostModel() is an expression of type ExpressionType.New
  • [100001].Body is an expression of type ExpressionType.MemberAccess
  • {IIF(([100001] != null), new EntityReference() {Id = [100001].Blog.Id, Name = [100001].Name}, null)} is an expression of type ExpressionType.Conditional. This expression has its own subexpressions.

It is kind of obvious that expressions form a tree.

NHibernate will recursively visit all linq sub-expressions and gets a list of all expression that can be evaluated in database using HQL select statement – e.g. registered methods, entities and its properties, but not unregistered function calls or constants (e.g. null, 1, “Hello World”, post.Body.Contains(“Hello”)).

Here is a screenshot of all sub-expressions from the query above. The NHibernate extracts following expressions (no method here, but it could be):

NHibernate Linq candidates

BTW here is the magic that allows user to add their own Linq provider extension happens.

3. Replace database subexpressions

Now we have a list of sub-expressions that can be queried directly in the database. We will take our Linq query and rewrite it a little more. NHibernate will go recursively through the Linq expression and rewrites HQL representable expressions with an item from an array. The reason is simple, once it performs the DB query, it will have the result in array and once it performs the Linq query, the result of original subexpression (e.g. post.Body.Contains(“Subexpression”)) will be already done by database and the result will be an item in an array (e.g. true or false). That is the whole point – doing stuff in database.

The select clause will now have stuff queriable in db replaced with an item in an array:

new PostModel() {
  Body = Convert(input[0]), 
  Blog = IIF((Convert(input[1]) != null), 
         new EntityReference() 
         {
           Id = Convert(input[2]), 
           Name = Convert(input[3])
         }, 
         null)
}

4. Perform database query

We have a list of sub-expressions we are interested from step 2, NHibernate will do its magic, query the database and gets result in an input array.

5. Evaluate Linq query

We have a rewritten Linq query from step 3 and we have data used in it from step 4. We can actually evaluate the query! NHibernate does it and returns result.

Back to the problem

Well, that is about it. Where is the problem? Why did I get all these extra columns in my query?

Easy, NHibernate doesn’t recognize operator == in a select clause and therefore it has to load whole entity Blog into a memory (=that is why it loads all properties) where it is compared with null.

I though about why does NHibernate behaves like this, but it seems reasonable after a while. What if the comparison was done between entity and some in-memory instance of entity? What about overloaded operator ==? Database would have no idea what to do, so we can’t have blanket == from NHibernate Linq provider.

Important: This is valid for the Linq select clasue, the where clause doesn’t work like that. The where clause works as you expect, unlike select.

  • If you use post.Blog != null in the Linq where clause, the NHibernate will correctly translate it to SQL  where post0_.Blog is not null.
  • If you use post.Blog == memoryBlog  you get where post0_.Blog=@p0.

Solution

Nope, we are not lost. We can create a custom Linq provider for checking null. The NHibernate will recognize our method EntityState.Exists as a something that can be performed in the database and replace load of a whole entity with an extra column in query:

public static class EntityState {
    public static bool Exists(EntityBase entity) {
        return !ReferenceEquals(entity, null);
    }
}

public class EntityStateGenerator : BaseHqlGeneratorForMethod
{
    public EntityStateGenerator()
    {
        SupportedMethods = new[] { 
            ReflectionHelper.GetMethod(() => EntityState.Exists(null)) 
        };
    }

    public override HqlTreeNode BuildHql(MethodInfo method, Expression targetObject,
        ReadOnlyCollection<Expression> arguments, HqlTreeBuilder treeBuilder, 
        IHqlExpressionVisitor visitor)
    {
        return treeBuilder.IsNotNull(visitor.Visit(arguments[0]).AsExpression());
    }
}

public class MyLinqToHqlGeneratorsRegistry : DefaultLinqToHqlGeneratorsRegistry
{
    public NerulaLinqToHqlGeneratorsRegistry()
    {
        var generator = new EntityStateGenerator();
        foreach (var method in generator.SupportedMethods)
        {
            RegisterGenerator(method, generator);
        }
    }
}

We also have to register the generator (for XML, the property name is linqtohql.generatorsregistry) or you can simple add it to the configuration

configuration.SetProperty(NHibernate.Cfg.Environment.LinqToHqlGeneratorsRegistry, 
  typeof(MyLinqToHqlGeneratorsRegistry).AssemblyQualifiedName);

Now, the query

from post in session.Query<Post>()
select new PostModel
{
    Body = post.Body,
    Blog = EntityState.Exists(post.Blog)
        ? new EntityReference
        {
            Id = post.Blog.Id,
            Name = post.Blog.Name
        }
        : null
}

Will result in the following SQL, because NHibernate will replace subexpression EntityState.Exists(post.Blog) with the result of SQL case statement per step 3,4 and 5:

select
    post0_.Body as col_0_0_,
    case 
        when blog1_.Id is not null then 1 
        else 0 
    end as col_1_0_,
    post0_.Blog as col_2_0_,
    blog1_.Name as col_3_0_ 
from
    Post post0_ 
left outer join
    Blog blog1_ 
        on post0_.Blog=blog1_.Id

The reason for weird query was quite simple, but pretty understandable. One extra column is OK with me. I just wish it was documented.

Using NHibernate readonly property accessor

Recently I needed to create a string combined from several columns in the database into a single field. More precisely, I had an User entity with a FirstName and Surname properties and I needed the FullName property for filling the data into a ViewModel. The entity itself has many properties and its hydration is quite slow.

public class User : EntityBase
{
    public virtual string FirstName {get;set;}
    public virtual string Surname {get;set;}
    public virtual string FullName
    {
        get { return string.Format("{0} {1}", FirstName, Surname); }
    }
    // many other properties
}

Querying a view model

I wanted to create a view model in single linq query. The query will get from database only the fields that are necessary, not whole entities (e.g. only Title and).

var viewModelQuery = 
    from post in session.Query<Post>()
    where post.Id == postId
    select new PostViewModel
    {
        Title = post.Title,
        Text = post.Body,
        CreatedBy = string.Format("{0} {1}", 
            post.CreatedBy.FirstName, post.CreatedBy.Surname)
    };

The corresponsing query looks like this – notice it is a single query that gets only what is needed, no unnecessary properties of the User or the Post are fetched. This is the reason why I really like the NHibernate Linq provider as opposed to Criteria API or QueryOver.

select
    post0_.Title as col_0_0_,
    post0_.Body as col_1_0_,
    user1_.first_name as col_2_0_,
    user1_.surname as col_3_0_ 
from
    Post post0_ 
left outer join
    User user1_ 
        on post0_.CreatedBy=user1_.Id 
where
    post0_.Id=@p0;

However, there is a flaw: The string.Format creation of the CreatedBy. I would like to use FullName property, but that is not possible, because it is not mapped and using CreatedBy = post.CreatedBy.FullName would throw a mapping exception.

Querying full name

We can get around that using a formula in the mapping that has same result as the property in the entity class:

<class name="User">
    <id name="Id">
      <generator class="increment" />
    </id>
    <property name="FirstName" column="first_name" />
    <property name="Surname" column="surname" />
    <property name="FullName" access="readonly" formula="(first_name || ' ' || surname)" />
</class>

Notice use of readonly access. It was implemented in the NH-1621. It is an accessor used for querying in the database and thanks to that we can use it directly in the query, the prerequisites are existing property with getter (no need for setter) and semantically same as the column in the database. In our case, the property in not mapped to a column, but to a formula, but result is same: We can use it directly in the query:

var viewModelQuery = 
    from post in session.Query<Post>()
    where post.Id == postId
    select new PostViewModel
    {
        Title = post.Title,
        Text = post.Body,
        CreatedBy = post.CreatedBy.FullName
    };
var viewModel = viewModelQuery.Single();
Assert.AreEqual("Post title", viewModel.Title);
Assert.AreEqual("Text of the post", viewModel.Text);
Assert.AreEqual("John Smith", viewModel.CreatedBy)

The SQL query uses formula instead of two separate columns and as before it does it without getting unnecessary properties:

select
    post0_.Title as col_0_0_,
    post0_.Body as col_1_0_,
    (user1_.first_name || ' ' || user1_.surname) as col_2_0_ 
from
    Post post0_ 
left outer join
    User user1_ 
        on post0_.CreatedBy=user1_.Id 
where
    post0_.Id=@p0

Now we can safely add a middle name or reverse order of full name simply by modifying a formula in the mapping file and the getter property and the change will appear everywhere as opposed to modifying the string.Format method.

Circular Dependency in Unity

I have been toying around with an idea of replacing the ObjectBuilder in the WCSF with a proper dependency injector. The Unity seemed like the obvious choice (someone even tried porting WCSF to Unity) so I have been reading Dependency Injection with Unity.

During my research I have stumbled upon the ugly side of the Unity: It can’t detect circular dependencies. I though that it is only true in some old version, it would be hell trying to find the circular dependency only with StackOverflowException so I tested it out. After all “One good test is worth a thousand expert opinions”:

public interface IA {}
public interface IB {}

public class A : IA
{
  public A(IB ib) {}
}

public class B : IB
{
  public B(IA ia) {}
}

[TestMethod]
public void UnityContainer()
{
  using (var container = new UnityContainer())
  {
    container.RegisterType<IA, A>();
    container.RegisterType<IB, B>();
    container.Resolve<IA>();
  }
}

It crashed, hard. The test didn’t pass, nor fail. I just got

—— Run test started ——
The active Test Run was aborted because the execution process exited unexpectedly. To investigate further, enable local crash dumps either at the machine level or for process vstest.executionengine.x86.exe. Go to more details: http://go.microsoft.com/fwlink/?linkid=232477
========== Run test finished: 0 run (0:00:05,2708735) ==========

Unity_CircularDependency_StackOverflowExceptionDuring debugging the test, I got the dreaded StackOverflowException, so there it is: No detection of circular dependencies in the Unity and that is the reason why I won’t be using it. There are other fishes in the barrel.

I have tried Ninject and Castle Windsor, thankfully, both detect circular dependencies and throw exceptions with meaningful messages. Ninject has this error message:

Ninject.ActivationException: Error activating IA using binding from IA to A
A cyclical dependency was detected between the constructors of two services.
Activation path:
3) Injection of dependency IA into parameter ia of constructor of type B
2) Injection of dependency IB into parameter ib of constructor of type A
1) Request for IA

Suggestions:
1) Ensure that you have not declared a dependency for IA on any implementations of the service.
2) Consider combining the services into a single one to remove the cycle.
3) Use property injection instead of constructor injection, and implement IInitializable
if you need initialization logic to be run after property values have been injected.

While Castle exception has this message:

Castle.MicroKernel.CircularDependencyException: Dependency cycle has been detected when trying to resolve component ‘UnityTest.A’.
The resolution tree that resulted in the cycle is the following:
Component ‘UnityTest.A’ resolved as dependency of
component ‘UnityTest.B’ resolved as dependency of
component ‘UnityTest.A’ which is the root component being resolved.

I am not impressed with the simplicity of the Castle nor with documentation of the Ninject, but I don’t want a nightmare of circular dependencies without meaningful error message in my project.

Chained web.config transformation

ASP.NET apps have a global configuration file called Web.config and because apps are published in several configurations (e.g. Debug and Release), there is a simple way to transform the web.config using transformation files instead of keeping two nearly identical web.config files at once. It is a great feature (it also works for App.config).

It works like this: you have a base Web.config and transformation files named Web.$(configuration).config (e.g. Web.Release.config) that transform the original Web.config (e.g. specifying smtp server) during deploy or publish.

The transformation files make it very easy to change Web.config, e.g. following snippet adds a key to the appSettings section.

<?xml version="1.0" encoding="utf-8"?>
<!-- For more information on using web.config transformation 
visit http://go.microsoft.com/fwlink/?LinkId=125889 -->
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <appSettings>
    <add key="Version" value="3.15" xdt:Transform="Insert"/>
  </appSettings>
</configuration>

You can look up info about possible transformations in the official documentation or for quick review go to the Scott Hanselman blog.

This allows one transformation, but you can have two chained transformations if you use publish profiles.

It is simple to do, in the context menu of the profile click on the Add Config Transform and Visual Studio will create a new transformation file that is applied after the build configuration transformation.

add-config-transform

In the following image you can see the result of staged transformation (Web.Debug.config and then Web.Integration.config, see red rectangle). You can see the diff between the original Web.config and transformed one using the “Preview Transform” item of the Web.$(configuration).config context menu.

 

web.config-chained-transformation

I thought it would be great, we have 5 different environments (local, CI, INT, ACC, PROD) and for each one two web.configs (i.e. 10 configs in total) that have to be kept in sync. It is a lot of rather error prone work. While testing it out, I encountered following problems:

Web.config is transformed only on publish, not build

We use Visual Studio to program and debug our application, so we simply choose Debug -> Start without debugging. That is troublesome, because the app will use the original, untransformed Web.config, because transformation is done only on Publish or Deploy and the resulting Web.config is stored somewhere else.

I often debug app, change Web.config and the IIS automatically detects that Web.config has changed and reloads the site. When I configured logging, I changed web.config a lot of time.

I am of course not the first person to encounter this, so there are some solutions. The *.csproj project file is only a MSBuild script, it can be modified.

  • Rename the base Web.config to Web.generic.config
  • Open your *.csproj file in text editor
  • Uncomment  <Target Name="AfterBuild"> target in the *.csproj. It is commented out along with   <Target Name="BeforeBuild"> target.
  • Add the TransformXml for Web.Config into the target
    <Target Name="AfterBuild">
      <TransformXml Source="Web.Generic.Config"
                    Transform="$(ProjectConfigTransformFileName)"
                    Destination="Web.Config" />  
    </Target>
    

Now, every time the project is build, the Web.config will be changed. Note that this doesn’t support chained transformation, so only build configuration is applied.

For details see the Making Visual Studio 2010 Web.config Transformations Apply on Every Build. You can also look at this forum thread that provides some other suggestions.

Sensitive information must not be in transformation file

Some information, e.g. password to our production database is not available to developers, so it can’t be in the transformation files, yet I want simple and reliable deploy to production I (as a developer) can do without our PM present. I am hoping to use a Web Deploy Parameters to do that.

References

 

 

 

ObjectBuilder

ObjectBuilder is a C# dependency injection framework, more precisely it is a framework for building a dependency injectors. There were two versions, the Object Builder 2 was later integrated into Unity Application Block.

The earlier versions is used by Web Client Software Factory (WCSF), the library for building web applications in WebForms. It is tool we are using for our internal system.

The first thing I notices about Object Builder is that it is woefully undocumented (official MSDN documentation), the source code is available, but no quick start or anything. I have googled a little and found very helpful post about how to actually use it to create objects. – sort of quick start tutorial.

Good, but not enough, I am trying to transfer out C# app from Web Site Project in Web Forms and WCSF to to MVC, but since WCSF auto-magically inserts all dependencies, I had to dive into source code of OB in order to understand it and later integrate it for MVC controller creation (I want to insert already existing services and other stuff into MVC controllers).

First, it is good to have an idea what functionality of ObjectBuilder do I have to replicate:

  • Services – Singletons that are alive during the whole life of web app. When constructor of object or property of object has a attribute [ServiceDependency], the OB will put there a singleton instance of some object.
  • New objects – whenever asked (through attribute [CreateNew]), the OB creates a new object.
  • There is also some registration of services and type mappings, but I am only interested in how to create a new object / get service just like the WCSF would.

Since the OB is not a dependency injection framework, but rather framework for building DI, it has only very simple DI framework – the one that can either create a new object or use singleton singleton instance.

The WCSF can use DI in two ways, either through constructor or by filling proprties (see official documentation):

public class MyClass {
  public MyClass(
    [CreateNew] IStoreInventory storeInventory,
    [ServiceDependency] ITimeService timeService)
  {
    // ... object initialization
  }
  
  [ServiceDependency]
  public IMyService MyService {get;set;}
}

Core concepts of the ObjectBuilder:

  • IBuilder – The builder that builds up or tears own the objects. The BuilderBase class is easy to understand. You can create a object like this
builder.BuildUp<MyClass>(locator, idToBuild, existing)
// In reality all parameters can be null when not used

The builder has a chain of building strategies (the various ways to create an object).

  • IBuilderStrategy – Strategy how will the objects be build.

The strategies can be varied, e.g. singleton strategy can look through locator and if there already is an object, return it; if the object is not in locator, create it, add to locator and return it. Or the the object can be build by some factory method.

  • IBuilderPolicy – Policy tailoring concrete implementations of IBuilderStartegy, e.g. IMethodPolicy can tailor which method will be called by MethodExecutionStrategy.
  • IBuilderContext – context used for one request to build up an object. It consists from locator, chain of building strategies and list of policies. It basically only holds tailored data passed to the chain of strategies of IBuilder.
  • ILocator – Basically a dictionary of id-object, it is used mostly for singletons, so when someone asks for an object with specified id, the OB will use locator to locate it.

When we request an object from IBuilder, it

  1. takes passed paramaters,
  2. creates a new IBuilderContext from passed parameters and other internal data (strategies, policies)
  3. IBuilder asks the head of builder strategy chain to build up an object.
  4. The IBuilderStartegy will look at the IBuilderContext and other data and determines if it can build up an object. If it can, it returns an object. If it can’t, it asks the next strategy in builder chain to try to build up the object. Note that each link of chain must call previous link.
  5. The base implementation BuilderStartegy of IBuilderStartegy will return the existing object (the one passed as parameter into IBuilder.BuildUp), if all strategies in chain fail.

This is rather high level description of OB, concrete example how to set it up, look at the David Hayden blog.