OpenRasta and Castle Windsor Concurrency Issue

Posted by gregsochanik on January 12th, 2012 – 3 Comments

A couple of months ago we discovered an issue in the 2.0.3 version of the OpenRasta project.

Heisenbug

To cut a long story short, we noticed a Heisenbug in our search endpoints, which use OpenRasta as a business layer between our API and the Apache Solr search engine.

Every now and then, with no apparent pattern, we would see a series of errors being thrown from Castle.Windsor. This was the error we saw:

System.IndexOutOfRangeException: Index was outside the bounds of the array.
     at System.Collections.Generic.List`1.Add(T item)
     at Castle.MicroKernel.Handlers.AbstractHandler
        .EnsureDependenciesCanBeSatisfied(IDependencyAwareActivator activator)

Checking the event logs on the live servers, we noticed that this error always corresponded exactly with an application pool recycle. This led us to think that the issue was related to application start-up.

DependencyResolverAccessor

OpenRasta has the concept of an IDependencyResolverAccessor, an interface that lets you plug in the Dependency Injection framework of your choice to set up your dependencies. OpenRasta can then resolve instances that have been added to the container at run time in the normal way.

Our DI framework of choice for this project was Castle.Windsor, which is a very mature solution and also integrates very well with SolrNet. The stack trace for the error led us to the WindsorDependencyResolver, which in turn led us to Castle Windsor’s own internal dependency store, which uses a generic List<T>. It turns out that List<T> is not thread safe: concurrent calls to Add are not supported without locking.

The DependencyResolver is set up as a Singleton, and is therefore meant to be created only once, at the start of the application. We deduced that at application start-up, if a large number of requests come through at the same time, several threads can add to the same List<T> concurrently. This in turn can throw the backing array out of sync with the size of the list, resulting in the IndexOutOfRangeException we saw.

To illustrate this, I wrote an integration test that used multiple threads to fire a large number of concurrent requests at the resolver, each one newing up an instance of WindsorDependencyResolver to emulate application start-up.
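The original integration test is not reproduced here. As a rough sketch of the underlying race only, the following shows unsynchronised List<T>.Add calls from many threads next to the same calls guarded by a lock. The class and method names (ListRaceDemo, UnsafeAdds, LockedAdds) are made up for illustration and are not part of OpenRasta or Castle Windsor.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ListRaceDemo
{
    // Adds `total` items from many threads with no synchronisation.
    // Returns the final Count, or -1 if List<T> threw part-way through.
    // The outcome is non-deterministic: items may be lost, or Add may throw.
    public static int UnsafeAdds(int total)
    {
        var list = new List<int>();
        try
        {
            Parallel.For(0, total, i => { list.Add(i); });
            return list.Count;
        }
        catch (AggregateException)
        {
            // Parallel.For wraps exceptions thrown by the loop body.
            return -1;
        }
    }

    // Same workload, but every Add happens under a lock,
    // so the final Count always equals `total`.
    public static int LockedAdds(int total)
    {
        var list = new List<int>();
        var gate = new object();
        Parallel.For(0, total, i => { lock (gate) { list.Add(i); } });
        return list.Count;
    }
}
```

On a multi-core machine UnsafeAdds will often return fewer items than requested, or -1, but it may also succeed on any given run, which is exactly what makes this kind of bug so hard to pin down.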

The Fix

To fix the issue, we needed to use the double-checked locking pattern around the resolver’s internal container. This ensures that only one container is ever set up, even if multiple threads reach this code at application start-up.

private static volatile IWindsorContainer _windsorContainer;
private static readonly object _synchRoot = new object();

public WindsorDependencyResolver(IWindsorContainer container)
{
    // First check: skip the lock entirely once the container has been set.
    if (_windsorContainer == null) {
        lock (_synchRoot) {
            // Second check: another thread may have set it while we waited.
            if (_windsorContainer == null) {
                _windsorContainer = container;
            }
        }
    }
}

Note the use of the C# volatile keyword, which gives every read of the singleton IWindsorContainer field acquire semantics and every write release semantics. This removes the need for explicit calls to .NET’s Thread.MemoryBarrier().
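To see the pattern behave as intended without pulling in Castle.Windsor, here is a minimal, self-contained sketch. FakeContainer, SketchResolver and StartupStampede are hypothetical stand-ins invented for this example; only the locking logic mirrors the fix above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for IWindsorContainer.
public sealed class FakeContainer
{
    public int Id { get; }
    public FakeContainer(int id) { Id = id; }
}

// Same shape as the fix above, reduced to the locking logic.
public sealed class SketchResolver
{
    private static volatile FakeContainer _container;
    private static readonly object _syncRoot = new object();
    private static int _assignments; // only written while holding the lock

    public SketchResolver(FakeContainer container)
    {
        if (_container == null) {
            lock (_syncRoot) {
                if (_container == null) {
                    _container = container;
                    _assignments++;
                }
            }
        }
    }

    public static FakeContainer Current { get { return _container; } }
    public static int Assignments { get { return Volatile.Read(ref _assignments); } }
}

public static class StartupStampede
{
    // Constructs many resolvers in parallel, each with its own container,
    // and returns how many times the shared container was actually assigned.
    public static int Run(int attempts)
    {
        Parallel.For(0, attempts, i => { new SketchResolver(new FakeContainer(i)); });
        return SketchResolver.Assignments;
    }
}
```

However many resolvers are constructed, StartupStampede.Run should report a single assignment: the first container to get through the lock wins and all later ones are ignored.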

This has been in production for 2 months and thankfully we’ve seen no repeat of the error!

  1. ghay says:

    Even though the code that creates the resolver is (double) locked, and only called once?

    https://github.com/7digital/openrasta-stable/blob/master/src/core/OpenRasta/Hosting/HostManager.cs

  2. Sebastien Lambla says:

    Note that the memory barrier is the preferred approach to this; volatile was the approach in the 1.x days, when we couldn’t specifically enforce a memory barrier.

    See http://blogs.msdn.com/b/brada/archive/2004/05/12/130935.aspx

    Also note there’s a bunch of issues with the Windsor kernel not being thread safe, so that code was also patched / updated for that.

    Seb

  3. gregsochanik says:

    Cheers Seb, good to know, will replace it with the MemoryBarrier.
    I just wanted to write it up to ‘double check’ it was the same issue that Huddle were having with Windsor. I did see a commit from them with a similar resolution.
