Logo

NHibernate

The object-relational mapper for .NET

NHibernate POID Generators revealed

(Disclaimer: This post will be more or less a paraphrase of Fabio Maulo’s post, and I hope I can improve it a bit)

This topic is something that I wanted to write because I wasn’t aware of the drawbacks of “ native/identityimage” generator has until Fabio told me. Now it is my turn to spread the information to those who aren’t aware too. I even made a small poll via twitter, to see who uses what, and the result turns out to be that majority of people use identity/native for some reasons.

NHibernate has several object identifier generators for entities. Each of them has their cons and pros as anything else does.

We can basically seperate generators into two: PostInsertGenerator and ORM Style generators ( you can also call them identity style vs ORM stlye generators). impact on your  I will investigate them in their categories.

ORM Style Generators 

ORM style generator can generate the identifiers before objects are sent to database. This is advantageous because you don’t need to go to database in order to have the ID, then set a relation based on this id. It also promotes Unit-Of-Work since you don’t need to go to database everytime an object is added/updated instead you do those at the moment of commit. Those generators are what WE SUGGEST.

Currently NHibernate provides several ORM style generators, some of them are listed below.

  • Guid
    Generates id’s by calling Guid.NewGuid(). Main drawback of this is with indexes. We know that Guids are more or less random(or pseudo-random let’s say) and this randomness creates fragmentation in database index. If you also think that the field is a PK, then it becomes more dramatic since they are stored in sorted manner.
  • Guid.Comb
    A very clever improvement over the Guid way. It creates guid based on the system time, and the guid it creates is database friendly. It doesn’t cause fragmentation in the table. You can see it from here

    I wonder if anybody reads the ALT of images? 
    (Image taken from Pamir Erdem’s blog)
    The effect of SequentialNewId() for default value has more or less the same effect of Guid.Comb
  • HiLo/Sequence HiLo
    This one is the one I like the most. It is both index friendly and user friendly. A HiLo id has 2 parts as the name suggests they are Hi and Lo. Each session factory gets the Hi value from database (with locking enabled), and lo values are managed by the session factory on its own. This algorithm also scales really well. All factories gets the Hi value only once. This reduces the the database traffic that aims to get the Hi values.

Post Insert Generators / Identity Style Generators

Post insert generators, as the name suggest, assigns the id’s after the entity is stored in the database. A select statement is executed against database. They have many drawbacks, and in my opinion they must be used only on brownfield projects. Those generators are what WE DO NOT SUGGEST as NH Team.

Some of the drawbacks are the following

  1. Unit Of Work is broken with the use of those strategies. It doesn’t matter if you’re using FlushMode.Commit, each Save results in an insert statement against DB. As a best practice, we should defer insertions to the commit, but using a post insert generator makes it commit on save (which is what UoW doesn’t do).
  2. Those strategies nullify batcher, you can’t take the advantage of sending multiple queries at once(as it must go to database at the time of Save)

There are several Post Insert Generator strategies (hey 2.1 has even more!) some of which are listed below(there are many, check Fabio’s post here)

  1. Identity
    Identity generator uses the value that is generated by MsSQL "identity” stuff. However, it’s meaning in the mapping changes depending on the dialect. For example, if database supports MsSQL like identity, then it will be used, if it supports sequences, then sequences will be used, etc. Something I learnt today from the NHUsers group is that MSSQL may sometimes return invalid SCOPE_IDENTITY() value.
  2. Guid.Native
    If I am to speak in terms of MsSQL terminology, it uses the NEWID() function to get a uniqueidentifier.

Comparison

I hear you say “you speak too much, all those doesn’t tell much, show me the code!” There it is, the comparison of post insert generators vs ORM style generators.

I will first start with demonstrating how they break UoW, then continue with Batcher! (did you know that NH uses NonBatchingBatcher by default? ;) )

The code under test is simple

[Test]
public void Should_not_insert_entity_in_a_transaction_HiLo()
{
var post = new PostWithHiLo {Title = "Identity Generators Revealed"};
var postComment = new PostCommentWithHiLo { Post = post, Comment = "Comment" };
using (ISession session = factory.OpenSession())
using (var tran = session.BeginTransaction())
{
session.Save(post); //No commit here
session.Save(postComment);
long insertCount = factory.Statistics.EntityInsertCount;
Assert.That(insertCount, Is.EqualTo(0), "Shouldn't insert entity in a transaction before commit.");
}
}

[Test]
public void Should_not_insert_entity_in_a_transaction_Identity()
{
var post = new PostWithIdentity {Title = "Identity Generators Revealed"};
var postComment = new PostCommentWithIdentity {Post = post, Comment = "Comment"};
using (ISession session = factory.OpenSession())
using (var tran = session.BeginTransaction())
{
session.Save(post);
session.Save(postComment);
long insertCount = factory.Statistics.EntityInsertCount;
Assert.That(insertCount, Is.EqualTo(0), "Shouldn't insert entity in a transaction before commit.");
}
}

Now, let’s try it. What do you expect in both cases? Should both test pass? The test with identity strategy fails as it tries to insert the entity even before calling a commit.

Now here is the explanation for the batcher:

using (ISession session = factory.OpenSession())
using (var tran = session.BeginTransaction())
{
for (int i = 0; i < 3; i++)
{
var post = new PostWithHiLo {Title = string.Format("Identity Generators Revealed {0}", i)};
session.Save(post);
}
tran.Commit();
}

The upper code sends queries to database only once. However, if you’re using the Identity style generators, then you’re in trouble.

Conclusion

You should know what you’re gaining and what you’re losing when using an identifier strategies. In case of a greenfield application, my choice would be to use HiLo as it is more user friendly(and this is what NH team suggests actually), and Guid.Comb in case a replication kinda thing is required. Most probably I wouldn’t use Identity. However, on a brownfield application, where you can’t really change the DB schema for some reason, than Identity should be used as a last resort.

I’d like to end this post with two sayings that I hear/see from Fabio

Human knowledge belongs to the world!
Quality is not achieved by chance!


Posted Thu, 19 March 2009 09:49:00 PM by tehlike
Filed under: NHibernate, poid, identifier

comments powered by Disqus
© NHibernate Community 2024