The object-relational mapper for .NET

How to

This page is converted from the old nhforge.org Wiki. First published by: Ken Egozi on 09-12-2008, Last revision by: John Davidson on 09-07-2011

Get unique results from joined queries


A recurring issue that new NHibernate users keep asking about, is the fact that they sometimes get duplicate results from their queries

For example, take a look at the next model:

where Comment.Name refer to the name of the person who wrote the comment.

If you'd want to issue a query on Post that joins to Comment, say "all of the posts that were commented by 'Ken Egozi'", the next hql query should do the trick:

   select  p
   from    Post p
   join    p.Comments c
   where   c.Name = 'Ken Egozi'";
So, you go ahead and
   var hql = @"
      select  p
      from    Post p
      join    p.Comments c
      where   c.Name = 'Ken Egozi'";
   var posts = session.CreateQuery(hql)

Assuming the next data is in the DB:

Comment by 'Ken Egozi' Comment by 'Fabio Maulo' Comment by 'Ken Egozi' Post#2 Comment by 'Fabio Maulo' Post#3 Comment by 'Fabio Maulo' Comment by 'Ken Egozi'

you'd expect the list 'posts' to include two posts: Post#1 and Post#3.
However, if you'd look into posts, you'll see that it has three entries, two entries pointing to Post#1 and one entry pointing to Post#3 

Note that NHibernate is smart enough not to create two separate instances for Post#1, but rather a single instance will be created, and the two entries in the list will point to the same instance in memory.


The said hql query (or an equivalent Criteria) is translated to the joined SELECT clause

   SELECT  {p.*}
   FROM    Posts p
     JOIN  Comments c ON c.PostId = p.Id
   WHERE   c.[Name] = 'Ken Egozi'

where {p.*} is substituted by the list of fields needed to satisfy a Post object. Running this SQL query directly on the data detailed above, will result with three lines in the resultset.


There are three solutions to that in SQL world:

  1. The DISTINCT keyword, when added on the fields of Posts, will ensure that there will be no duplicate lines. there are a few problems with that. Some RDBMS systems will not allow DISTINCT over certain data types, like the blob types in SQL Server. And if you also need to retrieve the joined data, then the DISTINCT keyword will be useless.
  2. Using a subselect
       SELECT *
       FROM   Posts p
          SELECT 1
          FROM   Comments c
          WHERE  c.PostId = p.Id
            AND  c.[Name] = 'Ken Egozi'
  3. Remove duplicates in the client code.

The solutions in NHibernate's world, in accordance to the SQL solutions:

  1. Use the 'distinct' keyword.
    The problems that exists in the SQL world still apply, and it can only be used in HQL, not it Criteria API.
  2. Use subselects.
    The syntax in HQL:
       from   Post p
       where exists
          from Comment c
          where c.Post = p
          and c.Name = 'Ken Egozi'
    and in Criteria API:
     var commentsByKenEgozi = DetachedCriteria.For<comment>()
         .Add(Restrictions.Eq("Name", "Ken Egozi"))
         .Add(Restrictions.EqProperty("Post.Id", "p.Id"))
      var posts = session.CreateCriteria(typeof (Post), "p")
  3. Remove duplicates in client code.
    That's actually pretty easy with NHibernate, thanks to the result transformers. all you need to do is add .SetResultTransformer(new DistinctRootEntityResultTransformer()) to either your query, or criteria, and NHibernate will remove duplicate entries from the resulted list.
    so, using the same hql as in the first example, you can write
     var hql = @"
          select  p
          from    Post p
          join    p.Comments c
          where   c.Name = 'Ken Egozi'";
       var posts = session.CreateQuery(hql)
          .SetResultTransformer(new DistinctRootEntityResultTransformer())
© NHibernate Community 2024