Development

How we do “innovation time”

Posted in Development, Innovation on January 26th, 2012 by Rob Bowley

Assuming you had consent from up above*, you’d think it’d be a breeze to get an initiative like innovation time off the ground. Surprisingly, at 7digital it took us three attempts before we got something to stick, and speaking to someone else recently I found they’d had a similar experience. As we’ve had our “innovation time” initiative running successfully for over a year now, I think it’s worth sharing how we got it to work.

In our first two attempts the main reason it failed was that we put too many barriers in the way. We didn’t think it was right that you could work on production code, we said you shouldn’t do it on your own, and we wanted every idea justified and approved by your peers, like a business case. Innovation really hates these kinds of things. Ironically, we thought “sensible” rules like these would ensure people didn’t waste their time on things we didn’t think could be classed as innovation, but they were based on a fundamental misunderstanding of how innovation occurs.

If you’re even slightly aware of your history regarding invention and discovery you’ll see a large proportion came about by happy accident, mostly when trying to prove something completely different. It’s such a common event it has its own name - serendipity. I strongly believe you’ve got to have a really open mind and shouldn’t have any particular expectations or goals in mind apart from innovation itself.

So for our third attempt we tried to really strip it down. Here are the rules we agreed:

Conditions of use of innovation time

  • There is no jurisdiction on how you use the time.
  • If you’re working on production code, or any application or tool which supports production code, you must do it properly i.e. follow the development team standards.
  • If you’re working on something unrelated to production code or existing applications you may work in any way you like. However if you wish for whatever you produce to be used in anger it must meet the usual standards (i.e. either write it to expected standards to begin with or rewrite it afterwards).
  • We should put anything suitable in the public sphere e.g. on GitHub and hosted where appropriate.

Allowance

  • You have 2 days a month you are able to request for innovation time.
  • Innovation time does not accrue – if you don’t take your allocation in a particular month it does not carry over.
  • If someone else does not use their allocation you cannot use it in their place.

Requesting innovation time

  • Request time through whosoff.com as “DevTeam Innovation Time” (we set up a special leave type).
  • It’s at the discretion of your lead developer (or head of development) to approve, e.g. they may not approve if they consider that the team’s ability to function effectively will be compromised because too many people are off, or we’re really busy with something and need all hands to the pumps.

Accountability

  • You must create a wiki page with the details of what you did and what you learnt.
  • For each day you take against a particular activity you should have a diary-like entry with the date and who was involved.

Using our leave booking system was a particularly inspired move. It means innovation can be managed just like holiday – it shows up in everyone’s calendars and allows everyone to plan around your absence. It also means we can pull out stats on how much time people have been taking:

Interestingly, out of a possible ~456 days in 2011, 189 were used, making it “4.74% time”. Another particularly surprising observation was that people took less innovation time when they were not feeling motivated about their usual work (we had a very painful database migration in the summer).

It really helps that we’re very team focused at 7digital (pair programming is really good for this) so it’s no great loss when any one person is away for a short time. It also really helps that we try very hard to work at a sustainable pace.

As for what’s been done we’ve had some really interesting and diverse projects. Many of them have simply been around investigating new technologies and ways of doing things rather than new product ideas (but we’ve had some of those too). I think in this respect a lot of the benefits are intangible, but that’s one of the interesting things about innovation – if you tried to measure it you’d kill it stone dead.

*I’ve no intention of going into the justification for initiatives like innovation time here. That would be another, very long article :)

Cucumber.js with zombie.js

Posted in Node.js, Testing on January 23rd, 2012 by Antony Denyer

I wanted to start looking at alternatives to our current set of cucumber feature tests. At the moment on the web team we’re using FireWatir and Capybara, so I thought I’d take a look at what was available in Node.js. Many people think it’s strange that a .NET shop would use something written for testing Ruby, or even consider something that isn’t from the .NET community. Personally I think it’s a benefit to truly look at something from the outside in. Should it matter what you’re using to drive your end product or what language you’re using to test it? Not really. So what are the motivations for moving away from Ruby, Capybara and FireWatir?

In a word: ‘flaky’. We’ve had heaps of issues getting our feature tests, AATs and smoke tests reliable. When it comes to testing, consistency should be king. These tests should be as solid as your unit tests. If they fail you want to know for definite that you’ve broken something, rather than thinking it’s a problem with the webdriver.

It is with this aim in mind that I started looking at the following.

Cucumber.js is definitely in its infancy. There’s lots of stuff missing, but there’s enough there to get going.

Zombie.js is a headless browser. It claims to be insanely fast, and there are no complaints here.

First up we got something working with the current implementation of cucumber-js: https://github.com/antonydenyer/zombiejsplayground. The progress formatter works fine and the usual “you can implement step definitions for undefined steps” snippets are a real help. Interestingly, rather than requiring zombie.js in our step definitions we ended up going down the route of implementing our own DSL inside world.js. We could have used another DSL like Capybara to protect us from changing the browser/driver we use. This is currently done with our Ruby implementation; the problem is that we’ve ended up implementing our own hacks to get round the limitations/flakiness of Selenium/WebDriver, and to date we have never ‘just swapped out the driver’ to see what happens when the tests run against Chrome/IE. That said, should you be using cucumber tests to test the browser? I don’t think you should. With that in mind we ended up implementing directly against zombie.js from our own DSL.

Extending cucumber-js https://github.com/antonydenyer/cucumber-js

There are a lot of things yet to be implemented in cucumber.js; one that gives me great satisfaction is the pretty formatter. Look, everything is green! It’s nowhere near ready for production but you do get a nice pretty formatter.

Thanks to Raoul Millais for helping out with command line parsing and general hand holding around JavaScript first steps.

OpenRasta and CastleWindsor Concurrency Issue

Posted in Development, OpenRasta, Search, SolrNet on January 12th, 2012 by gregsochanik – 3 Comments

A couple of months ago we discovered an issue in the 2.0.3 version of the OpenRasta project.

Heisenbug

To cut a long story short, we noticed a Heisenbug in our search endpoints, which use OpenRasta as a business layer between our API and the Apache Solr search engine.

Every now and then, with no apparent pattern, we would see a series of errors being thrown from Castle.Windsor. This was the error we saw:

System.IndexOutOfRangeException: Index was outside the bounds of the array.
     at System.Collections.Generic.List`1.Add(T item)
     at Castle.MicroKernel.Handlers.AbstractHandler
        .EnsureDependenciesCanBeSatisfied(IDependencyAwareActivator activator)

Checking the event logs on the live servers we noticed that this error always corresponded exactly with an application pool recycle. This then led us to think that the issue must be to do with application start-up.

DependencyResolverAccessor

OpenRasta has a concept of an IDependencyResolverAccessor, which exposes an interface allowing you to implement your own choice of Dependency Injection framework to set up your dependencies. OpenRasta can then resolve instances that have been added to the container at run time in the normal way.

Our DI framework of choice for this project was Castle.Windsor, which is a very mature solution, and also integrates very well with SolrNet. The stack trace for the error led us to the WindsorDependencyResolver, which then led us through to Castle Windsor’s own internal dependency store which uses a generic List<T>. It turns out that .NET generic Lists are not thread safe.

The DependencyResolver is set up as a Singleton, and therefore is only ever called once, at the start of the application. We then deduced that what must be happening is that at application start-up, if a large number of requests come through at the same time, they can all access the same List<T> concurrently. This in turn can throw the backing array out of sync with the size of the list, resulting in the IndexOutOfRangeException we saw.

To illustrate this, I was able to write an Integration Test that used Threading to fire a large number of concurrent requests at it, each one newing up an instance of WindsorDependencyResolver to emulate application startup.
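The kind of race described above can be sketched in a few lines. This is a minimal illustration only, not the integration test mentioned in the post: the class and method names are made up for the example, and it hammers a bare List<T> from several workers instead of newing up a WindsorDependencyResolver.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo class: not part of OpenRasta or Castle Windsor.
public static class ListRaceDemo
{
    // Each worker adds itemsPerWorker items to one shared List<int>.
    // Returns the final Count, or -1 if List<T> threw under concurrent use.
    public static int AddConcurrently(int workers, int itemsPerWorker, bool useLock)
    {
        var list = new List<int>();
        var gate = new object();
        try
        {
            Parallel.For(0, workers, worker =>
            {
                for (int i = 0; i < itemsPerWorker; i++)
                {
                    if (useLock)
                    {
                        // Serialised access: only one thread mutates the list at a time.
                        lock (gate) { list.Add(i); }
                    }
                    else
                    {
                        // Unsynchronised access: may lose items or corrupt the backing array.
                        list.Add(i);
                    }
                }
            });
        }
        catch (AggregateException)
        {
            // Parallel.For wraps exceptions thrown by the workers, such as
            // the IndexOutOfRangeException seen in the stack trace above.
            return -1;
        }
        return list.Count;
    }
}
```

With useLock set to true the count always comes back as workers × itemsPerWorker. With useLock set to false the outcome varies from run to run: the call may report a failure, or return a count lower than expected because some adds were lost.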

The Fix

To fix the issue, we needed to use the double-checked locking pattern around the resolver’s internal container. This ensures that there is indeed only ever one container set up, even if multiple threads access it on application start-up.

// Shared by every resolver instance; volatile so all threads see the latest write.
private static volatile IWindsorContainer _windsorContainer;
private static readonly object _synchRoot = new object();

public WindsorDependencyResolver(IWindsorContainer container)
{
    // First check avoids taking the lock once the container has been set.
    if (_windsorContainer == null) {
        lock (_synchRoot) {
            // Second check: another thread may have set it while we waited for the lock.
            if (_windsorContainer == null) {
                _windsorContainer = container;
            }
        }
    }
}

Note the use of the C# volatile keyword to enforce read/write barriers around all access to the singleton IWindsorContainer. This removes the need to use .NET’s Thread.MemoryBarrier().

This has been in production for 2 months and thankfully we’ve seen no repeat of the error!

OpenRasta is opening up to the community

Posted in OpenRasta on January 11th, 2012 by Antony Denyer – 1 Comment

Last Thursday a few of us from 7digital had a meet-up with Sebastien Lambla, author of OpenRasta.

As some of you may know, we’ve been writing all our new API endpoints using OpenRasta. We have a vested interest in ensuring the success of this project and as such are responding to the rallying cry with gusto.

So what’s going to happen? Essentially 7digital, along with Huddle and Neil Mosafi, will be jumping on board to help out with development and maintenance of OpenRasta 2.x. Short term goals are to help people get started with OpenRasta. At the moment it’s not particularly easy to get going with the 2.1 code. The first thing to get up and running is a build server; this is something that 7digital will be taking responsibility for. Our first aim is to build OpenRasta and publish the latest binaries with every push and make those binaries available in OpenWrap, NuGet and as binary downloads.

We’re really looking forward to working with everyone on OpenRasta and can’t wait to get stuck in.

Productivity = throughput and cycle time

Posted in Agile, Development on January 6th, 2012 by Rob Bowley – 1 Comment

I see lots of articles and discussions on how you measure the productivity of a development team. Having worked with cycle time and throughput for a few years now I’ve come to the conclusion that when combined these very simple measurements are a highly effective gauge of productivity.

Cycle Time

We measure cycle time as the count of working days between work starting (“in progress”) and completion (“done” or “live”), and it is shown below per item over time. The chart below also shows a trend line (going down, nice) and error bars for the standard deviation:

Ideally you want to see the trendline on your cycle time chart going down, but more importantly you don’t want to see it going up. Shorter cycle times suggest we’re delivering value to the organisation quickly and do not have money unnecessarily tied up in inventory (unreleased code). It also means we’re more predictable and able to respond quickly to change.

To be truly useful, cycle time has to measure the time it takes for things to really be done, not some “potentially shippable” nonsense. It’s also very simple to measure – you just need to track when the work starts and when it ends. We say a piece of work starts when it’s taken off the top of the prioritised queue and is done when it’s been released to production (and verified). Sometimes we’ve done the analysis and requirements beforehand, sometimes we haven’t (I don’t really care, to be honest, as we never allow more than 5-10 items in our prioritised queue, so it’s not like we’re doing loads of unnecessary up front analysis or consider it a huge time drain).

Throughput

We measure throughput as a count of items completed per month:

Obviously the more work we complete (with the same number of people) the better (though preferably that means more features and fewer bugs).

Individually these measurements are useful, but can be misleading if considered in isolation – cycle time going down is great, but if you’re only delivering one piece of work a month it’s not so great. Throughput can go up or down, but that could be because you’ve more or fewer people in your team.
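Both measurements need nothing more than a start date and a done date per item. The sketch below shows one way they could be calculated. The WorkItem and FlowMetrics names are invented for this example, and two conventions are assumptions not taken from the post: working days are Monday to Friday with no bank holidays, and the start day is counted but the release day is not.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical type: one card from the board with its start and release dates.
public sealed class WorkItem
{
    public DateTime Started { get; }
    public DateTime Done { get; }

    public WorkItem(DateTime started, DateTime done)
    {
        Started = started;
        Done = done;
    }
}

public static class FlowMetrics
{
    // Cycle time: Monday-to-Friday days from the start date up to,
    // but not including, the date the item was released.
    public static int CycleTimeInWorkingDays(WorkItem item)
    {
        int days = 0;
        for (var day = item.Started.Date; day < item.Done.Date; day = day.AddDays(1))
        {
            if (day.DayOfWeek != DayOfWeek.Saturday && day.DayOfWeek != DayOfWeek.Sunday)
            {
                days++;
            }
        }
        return days;
    }

    // Throughput: number of items completed in each calendar month, keyed "yyyy-MM".
    public static Dictionary<string, int> ThroughputPerMonth(IEnumerable<WorkItem> items)
    {
        return items
            .GroupBy(i => i.Done.ToString("yyyy-MM", CultureInfo.InvariantCulture))
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Under these conventions an item started on Monday 3rd October 2011 and released on Friday 7th October 2011 has a cycle time of 4 days. Tracking both numbers side by side is what guards against reading either one in isolation.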

It’s hard to say if productivity can really be measured by just cycle time and throughput and, having witnessed metrics being gamed in the past (or causing undesirable behaviour), I’m wary of trying to improve them directly. However, as a couple of really simple measurements we can track over time, they’re a really good guide to support or refute anecdotal evidence as to whether anything we’re doing is having an impact on our ability to deliver. A great example is the two charts above. In around Sept/Oct they show one of our team’s cycle times going down and throughput going up, which was largely a result of changing tack and spending a considerable amount of time improving the build process and removing/fixing flaky tests.

What can you try to increase your team’s throughput and reduce cycle time? What can these measurements tell us about the impact of X happening?

There are of course many other ways to judge the effectiveness of a team and I wouldn’t want to get too caught up with just these ones, but it certainly pleased me greatly to see how the team above’s measurements improved and it certainly corroborated with the mood of the team since they managed to turn things around.

I’m of the mind increasing throughput and reducing cycle time is A Good Thing™. It generally means we’re getting more done, more quickly.

A Day in the Life of a 7digital Developer

Posted in Agile, Continuous Integration, Development, Testing on November 9th, 2011 by Paul Shannon – 2 Comments

We were invited to give a guest lecture at the University of Nottingham this week to form part of their 2nd year group project module. This invitation came about through discussion at previous group project open days when the parallels between small agile software development teams and the group project teams highlighted that agile methods might help students during this module.

Enthusiasm from our team to contribute to these lectures has actually led to the possibility of a 3 lecture series from 7digital, with future lectures on Test Driven Development and Refactoring/Code Quality. The first was more about Kanban, Continuous Integration and XP/Agile practices.

The Presentation

We produced a short Prezi for the lecture which is available online directly on the prezi web site or should be conveniently embedded below.

The Content

To avoid providing a death by powerpoint presentation we often have different visuals to the actual content of the presentation. In this case we have a few text slides but the majority of the content was delivered during the presentation itself. To ensure that those that missed it still have the opportunity to gain as much information as possible we thought that we’d include it here on our blog.

Introduction

The presentation takes the form of a day in the life of a 7digital developer, in this case, James Atherton. Each “slide” in the prezi represents a different event during James’ day, starting with the sunrise over London and arriving at an almost empty office.

Stand Up Meetings

Each team has a stand up meeting at around 10am. The purpose of this meeting is to get a quick overview of the team’s current situation, find out if any development tasks have been blocked, assign the day’s tasks, organise who will be pairing with whom and assess any broken builds. We physically stand up for this meeting to maintain focus and stop it dragging on.

During the 2nd year group project this type of regular, focussed and short meeting could be invaluable to the success of the group’s project. If you focus on blockers as we do, then the whole team can swarm on any bottlenecks so that problems are solved quickly and value can be added to your software consistently. We use this meeting to organise pairing, which means you can easily swap pairs regularly to ensure that everyone gets a good understanding of all aspects of the system. When I did my project in 2002 we split the team into programmers, documenters and organisers which I can now see was a mistake – sharing responsibility and knowledge by regularly swapping is a much better strategy.

Kanban

The notion of pulling value through the various stages of development comes from the manufacturing of cars by Toyota. We use different styles of Kanban board for different teams but they all have something in common: features broken down into deliverable chunks and written on cards. The cards move from the left to the right with a focus on pulling things from the right. For example if there is something in the “deploy to live” column then this will get done before a new card is moved into the “in development” column on the left. You can see the examples of our boards in the presentation above.
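The “pull from the right” rule can be expressed as a small piece of code. The sketch below is purely illustrative: the Column and KanbanBoard types, the column names and the work in progress limits are all invented for this example and do not describe any tool we use (our boards are index cards on a wall).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one column on a board, with a work in progress limit.
public sealed class Column
{
    public string Name { get; }
    public int WipLimit { get; }
    public List<string> Cards { get; } = new List<string>();

    public Column(string name, int wipLimit)
    {
        Name = name;
        WipLimit = wipLimit;
    }

    public bool HasCapacity => Cards.Count < WipLimit;
}

public static class KanbanBoard
{
    // Columns are ordered left to right; the last column is "live".
    // Scans from the right and advances the first card that can move,
    // so work closest to release is always finished before new work starts.
    // Returns a description of the move, or null if nothing can move.
    public static string PullNext(IList<Column> columns)
    {
        for (int i = columns.Count - 2; i >= 0; i--)
        {
            var from = columns[i];
            var to = columns[i + 1];
            if (from.Cards.Count > 0 && to.HasCapacity)
            {
                var card = from.Cards[0];
                from.Cards.RemoveAt(0);
                to.Cards.Add(card);
                return $"{card}: {from.Name} -> {to.Name}";
            }
        }
        return null;
    }
}
```

Because the scan starts at the right-hand side, a card waiting in “deploy to live” always moves before anything is pulled from the backlog into “in development”, which is the behaviour described above.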

The main benefit you’ll find by using Kanban in the 2nd year group project is the focus it provides and the prevention of overload. If you only work on one or two cards at a time, and deliver these to live before beginning any new cards, then your software grows quicker and you can gain feedback sooner. Being able to get feedback from your users as soon as possible is invaluable to guide the features that you’re creating. This means that you can change the features in the system based on what the users are actually using the system for.

A good example is the failure of my 2nd year group project: we were asked to build a live catalogue of objects that were present in a virtual reality world, so we assumed this meant a fully featured rendering of all the objects with some way of navigating. This wasn’t the case. The users only wanted to see the properties of the object in a simple tree view – had we delivered a basic text version first we would have known this was the only requirement, but instead we spent weeks learning and developing in 3D graphics libraries, and our project was never finished, so the user never got a useful tool.

Kanban boards need to be visible and easy to use – we make ours with blutack, wool and index cards. All you need is 1 sq metre of wall space. Put one up in a team member’s house, or ask for some space on campus, or you could even try an online tool but you will lose the interactivity of it.

Test Driven Development

Not only will test driving your code ensure it works, it also means that the code’s API is loosely coupled, as providing test coverage means that classes have to be open. One of the main benefits of automated testing, though, is the ability to easily re-run the tests and verify that your changes worked – this means you can make lots of small incremental changes to the software with confidence, making it easier to change, more readable and more robust. We use 5 levels of testing and we recommend you do the same. The same test driven development process applies to all levels: write a failing test, make it pass, refactor to remove duplication and code smells, write the next test…

  1. Unit testing – covers the functionality of a single unit of work, usually a single class or method, or the object’s collaborators and the messages passed between objects. You should have lots of these.
  2. Integration testing – ensures that parts of the system work together correctly e.g. the database adapter can read, write and transform persistent data into objects correctly, or an external web service returns what you expect. You should have at least one of these per integration point.
  3. Acceptance testing – written using cucumber syntax (Given, When, Then) in plain English means that other stakeholders/users can describe how a feature will work by detailing their expectations. You should have a few of these per feature.
  4. Smoke testing – these usually cover the “happy path” use case of the application, for example, searching for an artist, making a purchase and downloading the mp3. They ensure that all aspects of the system are functioning together correctly and unlike most of the other tests, can assert that your live environment is working. You should only have a couple of these.
  5. QA – this level of manual testing through logical test plans, performance testing frameworks and randomised trialling is done by our QA team. You should ensure that you dedicate some time to QA as not all problems can be found by automated testing.
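To make the write-a-failing-test, make-it-pass cycle concrete, here is a minimal sketch at the unit test level. Everything in it is hypothetical: BasketTotal and its test are invented for this example, and the test is a plain method that throws on failure so the sketch does not depend on any particular test framework.

```csharp
using System;

// Step 2 (green): the simplest implementation that makes the test below pass.
public static class BasketTotal
{
    public static decimal Of(params decimal[] prices)
    {
        decimal total = 0m;
        foreach (var price in prices)
        {
            total += price;
        }
        return total;
    }
}

// Step 1 (red): this test is written first, before BasketTotal exists,
// so the first run fails (here it would not even compile).
public static class BasketTotalTests
{
    public static void Total_of_two_tracks_is_the_sum_of_their_prices()
    {
        var total = BasketTotal.Of(0.99m, 0.79m);
        if (total != 1.78m)
        {
            throw new Exception($"Expected 1.78 but was {total}");
        }
    }
}
```

Step 3 is to refactor with the passing test as a safety net, then write the next failing test (an empty basket, say) and go round the loop again.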

We are coming back for a lecture on Test Driven Development where two developers will test drive a simple application, live, in the lecture so that you can see how the technique works, and hopefully the benefits of it. This will be on Monday 28th November.

Pair Programming

We pair on most development tasks at 7digital as it promotes knowledge sharing, prevents mistakes, ensures you get the best ideas from the team and makes the whole process much more sociable. If we’d used pair programming during my 2nd year project I’d have enjoyed it a lot more as the whole team would have shared the knowledge rather than one person doing the bulk of the programming. If you pair with someone who is a more confident programmer then you’ll become more confident yourself, ensuring that by the end of the project the whole team can contribute equally to the production of the software, presentation and open day.

Continuous Integration

We use Team City to track our builds and it runs builds that will compile the application, run the 4 levels of automated tests and even build deployment packages that we can then automatically (using another team city build) deploy to our testing and live environments. On a new project we’ll firstly try and deploy a “walking skeleton”. This will usually be a basic status page that we ensure we can deploy all the way to live. Once we know we can easily and automatically deploy our code it means we can add features much quicker, turning around features on some projects in a few hours. We suggest you do the same, automate as much as you can early on, and you’ll be relieved when you are developing features, responding to feedback and fixing defects later in the project.

Other popular CI servers we mentioned are CruiseControl, Jenkins and Hudson.

Releasing and Work in Progress Limits

As we’ve already said, automate releasing and do it often. There is nothing more valuable than getting feedback from your users. Break features down into small deliverable chunks – if this means hard coding a response from an external service or database then do that; at least your users can try it out with the hard coded data. Breaking features down is hard, but keep trying and don’t be afraid to abandon something that is too large in favour of mocking part of the system.

When creating a new feature make sure you collaborate with the users to define the acceptance criteria in cucumber syntax that you can then use to write acceptance tests. Also ensure that you are not breaking any work in progress limits by pulling from the right on your kanban board and only working on a limited number of features at once.

We don’t do any formal design, we’ll usually have a session by the whiteboard to give a high level overview of the architecture but we prefer to test drive the code and let the tests document the functionality. If you spend time designing the system completely up front you will be less reactive to change and find it hard to adapt when you get user feedback.

Blockers

Lots of things block our software development: dependent tasks, other teams, hardware issues and configurations. To ensure that the flow continues we “subordinate to the bottleneck” and swarm on the problem. This means the whole team will attempt to help solve the blocking issue rather than continue working and compound any issues. Blockers should be identified as early as possible – this is what the Kanban board and stand up meetings are for. I’m actually writing this blog post while the rest of the team are fixing a blocker – some of our test environments are down so deploying to these now would cause even more issues. Once the problem is fixed we’ll continue development, but this focus on the blocker has created some slack time in the team that we can devote to other important tasks that are not related directly to development.

Further Reading

Googling terms like “Kanban”, “Stand up meetings”, “continuous integration”, “work in progress limits”, “test driven development” and “acceptance testing” should give you plenty of resources but some specifically useful books are:

Careers

We are always trying to grow our team with enthusiastic and intelligent people. We offer a number of positions for developers and even have an internship scheme. All the tips we’ve covered in our presentation should help you out with the 2nd year group project but above all, an understanding and experience in these areas will make you stand out from the crowd when it comes to looking for a career. If all graduates came to us with knowledge of TDD, CI, Clean Code and Kanban it would make expanding our team much easier:

http://about.7digital.net/Careers

Creating a basic catalogue endpoint with ServiceStack

Posted in OpenRasta, REST, ServiceStack, Uncategorized on October 17th, 2011 by gregsochanik – 1 Comment

Overview

ServiceStack is a comprehensive web framework for .NET that allows you to set up a REST web service quickly and with very little effort. We already use OpenRasta to achieve this same goal within our stack, so I thought it would be interesting to compare the two and see how quickly I could get something up and running.

The thing that most interested me initially about ServiceStack was the fact that it claims out of the box support for Memcached, something we already use extensively to cache DTOs, and Redis, the ubiquitous NoSQL key-value store.

Getting cracking

I set myself the task of creating a basic endpoint for accessing 7digital artist, release and track details, whilst taking advantage of ServiceStack’s ability to create a listener from a console window so I didn’t have to waste time attempting to set it up via IIS:

using System;

class Program
{
    static void Main(string[] args)
    {
        // AppHost is the ServiceStack host class described in the next section.
        var appHost = new AppHost();
        appHost.Init();
        // Listen for incoming requests on localhost, port 2211.
        appHost.Start("http://localhost:2211/");
        // Block on console input so the listener stays alive.
        Console.Read();
    }
}

As you can see this couldn’t be simpler. Whilst the thread is running, it will listen at localhost on port 2211 for incoming requests.

AppHost

Every ServiceStack implementation starts with the concept of an AppHost, which is a catch-all class that exposes the initial setup of your service. For a console app based HttpListener setup it relies on you overriding the AppHostHttpListenerBase.Configure() method, which offers up a Container for access to the built-in IOC. Funq is the weapon of choice in ServiceStack.

It seems a shame that ServiceStack doesn’t abstract away the responsibility of the IOC, allowing the developer the chance to write their own IOC implementation as you can with OpenRasta, but the emphasis with ServiceStack is on speed (both of performance and setup) and Funq is perfectly adequate.

Routes

The AppHost is where it is suggested that you set up your Routes. As with the ConfigurationSource in OpenRasta, this is where you define your Resource – UriTemplate relationship:

Routes
    .Add<Artist>("/artist/details")
    .Add<Artist>("/artist/details/{Id}")
    .Add<Release>("/release/details")
    .Add<Release>("/release/details/{ReleaseId}");

As with OpenRasta, the resource is represented by a simple DTO. In exactly the same way (via the KeyedValueBinder attribute in OR) your DTO represents the incoming request parameters, or the POST request representation of your resource.

Services

The actual service itself (the equivalent of the Handler in OR) is where the request gets processed. There are a couple of ways you can accomplish this, but I opted for the documented method of deriving from the RestServiceBase<TResource> class. From within here you can then override a set of “On” methods which ServiceStack routes the call through to depending on the verb used, for example:

public override object OnGet(Artist request) {
    // Service logic here
}

Those familiar with OpenRasta will recognise a similar concept in the setting up of Handlers, but with a few subtle but important differences. OpenRasta successfully decouples the concept of a handler from your implementation by allowing you to tie it to a resource, which from a clean code perspective I prefer.

OpenRasta also more importantly does not assume that you will be implementing all http verbs within a handler, and returns a valid 405 Method Not Allowed if you have not implemented that method for a service.

IOC implementation

ServiceStack’s default Funq works very well and you can opt for either constructor injection or property injection. Ctor injection is our (and should also be your) preferred way of achieving this, and Funq handled this perfectly. As mentioned earlier, you set up your container within the AppHost. You can then use the familiar Register<TInterface>(ConcreteInstance) to set up your dependencies.

MediaTypes / Features

ServiceStack’s great selling point is the ability to set up a “vertical slice” of a site incredibly quickly, and this it does without fail. Once I had my resource DTOs, service and AppHost set up I was able to access it immediately. It also supports many different media types out of the box, which can be turned off and on within the AppHost like so:

SetConfig(
   new EndpointHostConfig
   {
      EnableFeatures = Feature.All.Remove(Feature.All)
      .Add(Feature.Xml | Feature.Json | Feature.Html),
   }
);

Caching and Redis

ServiceStack is true to its word that it supports a caching layer out of the box, and it is really easy to set up. It comes with its own MemoryCacheClient, which works well as a basic .NET IDictionary implementation of a cache. It supports TTLs, but I’m not sure about LRU (least recently used) or other caching strategies. Each cache class implements the ICacheClient interface, and you just set up the dependency in the AppHost in the normal way. You can then inject it into your caching service as normal. Redis works in exactly the same way, and it does just work. I was very impressed with its implementation. It would have been nice to test its Memcached setup, but that didn’t come with the latest release; it’s only available within the latest cut from GitHub. I downloaded it, but due to some initial setup issues ran out of time before I could play with it. It’s essentially an adapter around the Enyim library which we already use for our Memcached setup.

Pipeline vs ResponseFilters

OpenRasta lets you “hook into” various stages of its pipeline process outlined here. I wanted to see if ServiceStack did the same, and I got very excited when I saw the concept of ResponseFilters, which again are set up in the AppHost. Sadly I ran out of time on this. I wanted to try to implement a “catch all” way of dealing with my HTTP response codes issue as mentioned above, but this could be something I investigate further at a later date.

HttpStatusCodes

My biggest issue with ServiceStack after poking around a bit revolved around status responses. If, within a service, you have not overridden a method for a specific http verb then you do not get a nice instant 405 Method Not Allowed response. You instead get a 500 Internal Server Error with the available mime-type representation of the error object and stack trace. In an attempt to rectify this, I ended up creating an interim ErrorHandlingRestServiceBase<TResource> abstract class as follows:
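using System;
using System.Net;
using ServiceStack.Common.Web;       // HttpResult
using ServiceStack.ServiceInterface; // RestServiceBase<T>

public abstract class ErrorHandlingRestServiceBase<T> : RestServiceBase<T>
{
    // Verbs that were not overridden surface here as NotImplementedException;
    // map that to 405 Method Not Allowed instead of the default 500.
    protected override object HandleException(T request, Exception ex)
    {
        if (ex is NotImplementedException)
            return new HttpResult(ex) { StatusCode = HttpStatusCode.MethodNotAllowed };

        return base.HandleException(request, ex);
    }
}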

The service then derives from this. Not the most elegant solution, but the only way I could see that it could be done. Having to implement functionality through inheritance rather than loosely coupled hooks can lead to complexity over time.
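The behaviour I was after, a 405 with an Allow header rather than a 500, can be sketched independently of either framework. The following is a minimal illustration in JavaScript, not ServiceStack or OpenRasta code; the handler shape, the dispatch function and the example artist are assumptions made for the sketch.

```javascript
// Hypothetical handler: only the verbs it implements are present as methods.
const artistHandler = {
  GET: (request) => ({ status: 200, headers: {}, body: { id: request.id, name: "Example Artist" } }),
};

// List the verbs a handler actually implements.
function allowedVerbs(handler) {
  return Object.keys(handler).filter((verb) => typeof handler[verb] === "function");
}

// Dispatch a request; an unimplemented verb yields 405 plus an Allow header
// listing what the resource does support, instead of an exception and a 500.
function dispatch(handler, verb, request) {
  const implemented =
    Object.prototype.hasOwnProperty.call(handler, verb) && typeof handler[verb] === "function";
  if (!implemented) {
    return { status: 405, headers: { Allow: allowedVerbs(handler).join(", ") }, body: null };
  }
  return handler[verb](request);
}
```

The point of the design is that the check lives in one place in the dispatch step, so individual services do not need to inherit from a special base class to get it.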

CustomSerialization

Another thing I didn’t get a chance to look at in more detail was customising your final mime-type related representation of your resource on the way back to the client. OpenRasta handles this excellently through the concept of a Codec, which is hooked up as a representation of a Resource with the ResourceSpace.Has syntax. This helps to leave the implementation of the request decoupled from the representation of the resource.

ServiceStack doesn’t seem to have an equivalent of this concept. In an attempt to ease you into an out-of-the-box implementation, it takes care of this all for you.

Summary

In my opinion, ServiceStack does deliver on its promises: it’s intuitive, user friendly and quick to set up. I’m sure if I’d had as much time with it as I have had with OpenRasta, I’d have found ways around the issues outlined above.

Currently I don’t see anything that would prompt me to think about using it instead of OpenRasta, but as a simple framework to quickly get an application up and running it’s definitely a winner.

The project is available here

Search

Posted in API, Search, Solr, SolrNet on October 13th, 2011 by Mark Unsworth – 2 Comments

We will be the first to admit that our search has been far from optimal for some time; it’s something that’s frustrated us as much as it has our users. Unfortunately the unprecedented growth of 7digital has taken its toll on the original search infrastructure that powered our platform for the last 7 years – that’s right we were 7 this year.

A few weeks back we quietly made some changes to the artist and release search. These changes have been in the works for several months and have improved the quality of our search results as well as the speed at which those results are returned. Alongside the improvements to quality and speed we are also now returning accurate pricing for releases across all of our catalogue, something that we haven’t been able to do previously.

Architecture

The main reason for the improvements has been our move away from using SQL Server Full Text search to using the open source Solr search platform.  Solr is a super fast open source search server built on top of the Lucene search library. We’re also using SolrNet to be able to index and query Solr from our .NET codebase – more on our SolrNet usage here.

We have a master-slave setup where we index all of our documents (~40m) to a single write-only master. This is then set to replicate out to a number of read-only slaves. We aren’t currently sharding the data across the slaves, so they are exact mirrors with HAProxy in front of them to balance the load.

Load Balancing

We originally went with a round-robin approach to load distribution, but realised that we were potentially caching the same query on each of the slaves, so we switched to the balance url_param feature of HAProxy. This means that the same query is always requested from the same slave. Average query times were reduced by 50% from this change alone. The graph below shows the average response time dropping off and stabilising once the change had been made.
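The idea behind routing on a URL parameter can be sketched in a few lines. This is only an illustration of the principle in JavaScript: the slave names, the parameter name q and the hash function are assumptions, and HAProxy’s own hashing is different.

```javascript
// Hypothetical slave names; in reality these would be the Solr slaves behind HAProxy.
const slaves = ["solr-slave-1", "solr-slave-2", "solr-slave-3"];

// Simple deterministic string hash, kept in the unsigned 32-bit range.
// Any stable hash would do; this is not the one HAProxy uses.
function hashString(text) {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Pick a slave from the value of one query-string parameter, so identical
// queries always land on the same slave and hit its warm cache.
function pickSlave(requestUrl, paramName = "q") {
  const value = new URL(requestUrl, "http://localhost").searchParams.get(paramName);
  if (value === null) return null; // no parameter: the caller falls back to round-robin
  return slaves[hashString(value) % slaves.length];
}
```

With round-robin, a popular query ends up cached once per slave; with this kind of affinity each query is cached on exactly one slave, so the same amount of memory holds more distinct queries.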

Improving Quality

We haven’t had to do a great deal to improve the relevance of the results returned as Solr gives you this for free, but we have been investing time in looking at the ways our users are searching and seeing what we’re missing from our index. Better logging of search requests should allow us to be able to understand more about where customers are not finding what they are looking for. We’ll blog more about this when we start work on it.

Speed Improvements

Our average search response times are now less than 200ms. This is a significant improvement on the days of SQL Server Full Text search, when the average query time via our API was around 2 seconds, and also on the initial implementation of search on top of Solr, which had average query times of around 500ms.

The image below, taken from our New Relic dashboard for our Search API, shows the last month’s stats for the Search API. The left hand chart shows the average response time (lower is better), the top right shows the Apdex (performance) score (higher is better) and the bottom right shows the number of requests per minute we are seeing.

API Search Traffic 4/9 - 4/10

To put this into perspective, if you search for ‘Lady Antebellum’ on Google it takes around 200 milliseconds, but through our API it only takes 58ms. OK, so Google do return a result set of 54 million pages, but they don’t show our artist page at the top!

Future Plans

We will be making more improvements to search over the coming weeks and months including a long awaited update to the track based search.

HATEOAS Console: an innovation project

Posted in REST on September 20th, 2011 by Matthew Butt – 2 Comments

At 7digital we have 2 days’ innovation time every month. During this time we can work on our own pet projects. This post is about my current project.

You can find the source of this project at https://github.com/bnathyuw/Hateoas-Console

Introduction

RESTful web architecture is becoming increasingly influential in the design of both web services and web sites, but it is still very easy to produce half-hearted implementations of it, and the tools that exist don’t always help.

In this project, I want to address this problem by building a new REST console that will:

  • Reward good implementations by making it easy to take advantage of all their RESTful features;
  • Help improve less good implementations by exposing their shortcomings.

Basic principles of a RESTful interface

Richardson and Ruby (2007 pp. 79 ff.) present a good analysis of RESTful interface design. Drawing on Fielding (2000 s. 5), but with a focus on actual practice, they identify four key principles of Resource-Oriented Architecture:

  1. Addressability;
  2. Statelessness;
  3. Connectedness;
  4. Uniform Interface.

Addressability means that any resource in the application that a consumer could want to know about has at least one URI. This criterion is fairly coextensive with Fielding’s Identification of Resources requirement.

Statelessness means that every request should contain all the information needed for its processing. This overlaps with Fielding’s requirement that messages be self-descriptive, and that hypermedia be the representation of application state.

Connectedness means that each resource representation should give addresses of all related resources. The most effective way to ensure connectedness will often be to produce an entry-point resource, from which it is possible to navigate to all other resources. This furnishes the other part of Fielding’s requirement for hypermedia as the engine of application state.

Uniform Interface means that all resources can be manipulated in the same way. For web services, this almost invariably means using the HTTP verbs, viz DELETE, HEAD, GET, OPTIONS, POST, PUT &c. This principle supports Fielding’s self-description criterion, and specifies the means of manipulation of resources.

Most REST consoles are fairly successful in accommodating principles 1, 2 and 4, but fail significantly in accommodating principle 3. Under Fielding’s terminology, existing REST consoles give little support for hypermedia as the engine of application state (HATEOAS).
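Connectedness is easy to state as a check: starting from the entry-point resource and following only the links found in representations, can you reach every resource? Here is a minimal sketch over an in-memory stand-in for an API. The URIs and the links array are hypothetical, and a real console would fetch each representation over HTTP instead.

```javascript
// Hypothetical API: each URI maps to a representation that lists its links.
const resources = {
  "/": { links: ["/artists", "/releases"] },
  "/artists": { links: ["/artists/1"] },
  "/artists/1": { links: ["/releases/9"] },
  "/releases": { links: ["/releases/9"] },
  "/releases/9": { links: ["/artists/1"] },
  "/orphan": { links: [] }, // exists, but nothing links to it
};

// Breadth-first walk from the entry point, following links only.
function reachableFrom(entryPoint, fetchLinks) {
  const seen = new Set([entryPoint]);
  const queue = [entryPoint];
  while (queue.length > 0) {
    const uri = queue.shift();
    for (const link of fetchLinks(uri)) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return seen;
}

const reachable = reachableFrom("/", (uri) => resources[uri].links);
// Resources that can only be found by already knowing their address.
const unreachable = Object.keys(resources).filter((uri) => !reachable.has(uri));
```

In this example only /orphan ends up in unreachable, which is exactly the kind of shortcoming a link-following console would expose.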

Existing REST consoles

There exist several good consoles for manually consuming RESTful services. These include:

  • Simple REST Client for Chrome
  • REST Client for Firefox
  • apigee

All of these clients work on a similar model: you enter a URI in the address box, choose an HTTP verb and click a button to send the request. You also have the option of adding headers and a request body. The headers and content of the response are then displayed on screen for the user to inspect.

How these consoles support the REST principles

Addressability

Addressability is a core notion in these consoles: the address box is a primary part of the UI, and you have to enter something here in order to make a request.

Statelessness

Statelessness is perhaps the easiest of the four principles to achieve, as the consoles operate on a request-response model.

In fact, what is useful in a console is the very opposite of statelessness: the console should be able to remember your preferences so that you do not have to enter them for each request.

With a significant exception discussed below, all three consoles do a fair job of remembering your choice of headers from one request to another, which takes some of the burden off the user. Apigee and REST Client for Firefox are also able to handle OAuth authentication, which is a nice feature.

Connectedness

None of the consoles deals successfully with connectedness. If you want to follow a link from the response, you have to copy the resource URI into the address box and submit another request.

Apigee differs from the other two consoles in having a side panel which lists the principal URI schemata for the service under test. This initially seems like a helpful feature, but has several unfortunate consequences:

  • Apigee uses WADL to create its directory of links. This encourages a return to the RPC-style of service architecture, which thinks of a web service as being made up of a limited set of discrete endpoints, each with a particular purpose, rather than an unlimited network of interconnected resources which can be manipulated through a uniform interface.
  • As the endpoints are listed in the directory panel, it is less obvious when a resource does not contain links to related resources.
  • Apigee has no way of filling in variable parts of a URI. If, for instance, you click me/favourites/track_id (PUT), it enters https://api.soundcloud.com/me/favorites/{track_id}.json in the address box. You then have to replace {track_id} with the specific track ID you are interested in. This is of course no help if you don’t know which track you want to add to your favourites!
  • Each endpoint is listed with a .json suffix, no matter what format you have just requested. Also, any request headers you have filled in are forgotten when you click on a new endpoint.

These shortcomings not only make the console frustrating to use, but also encourage non-connected, RPC-style architectural decisions.
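The variable-URI problem in the list above is worth making concrete. A console could fill in {placeholders} from values it already knows and say clearly which ones it cannot resolve, rather than leaving the user to edit the address by hand. This is a minimal sketch under that assumption; fillTemplate is a hypothetical helper and handles only simple {name} placeholders, not the full URI Template syntax.

```javascript
// Fill {name} placeholders from known values; report any that have no value
// instead of guessing, and leave them visible in the resulting URI.
function fillTemplate(template, values) {
  const missing = [];
  const uri = template.replace(/\{([^}]+)\}/g, (placeholder, name) => {
    if (Object.prototype.hasOwnProperty.call(values, name)) {
      return encodeURIComponent(String(values[name]));
    }
    missing.push(name);
    return placeholder;
  });
  return { uri, missing };
}

const template = "https://api.soundcloud.com/me/favorites/{track_id}.json";
const filled = fillTemplate(template, { track_id: 42 });   // uri ends in /favorites/42.json
const unfilled = fillTemplate(template, {});               // missing contains "track_id"
```

Of course this only helps once the track ID is known. In a connected API the ID would not need to be typed at all, because a track representation would carry the link to follow.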

Uniform Interface

As with Addressability, the Uniform Interface is at the core of these consoles. The HTTP verb selector is prominent in each UI, and it is easy to switch from one to another.

Apigee supports GET, POST, DELETE and PUT, Simple REST Client for Chrome adds support for HEAD and OPTIONS, and REST Client for Firefox adds support for TRACE, as well as several more obscure verbs.

What none of these consoles does is make any attempt to figure out what representation of a resource should be submitted in a POST or PUT request body. This is particularly surprising in Apigee, as this information should be available in the API WADL document.

Conclusion

There are close points of comparison between a REST console and a web browser: each is designed to make requests from a particular URI using one of a small number of HTTP verbs, and then display a representation of that resource to the user. What makes a web browser so powerful, and indeed was one of the founding principles of the internet, is that the user can click on links to get from one page to another. When you consider the primacy of the clickable link to the success of browsers, it becomes all the more puzzling that REST consoles do not implement this functionality.

The Project

Basic principles

The purpose of this project is to attempt to address some of the shortcomings of the currently available REST consoles, while retaining their good features:

  • The basic format of the existing consoles is successful: an address box, a verb chooser and a send button;
  • Rendering all details of the response is also vital; REST Client for Firefox gives you the choice of viewing raw and rendered data, which is a nice additional feature;
  • The client should support as wide as possible a range of HTTP verbs, encompassing at least GET, POST, PUT, DELETE, OPTIONS, HEAD;
  • The ability to remember headers is very useful and should be kept, especially when clicking on a link;
  • OAuth integration is a nice feature and worth implementing if possible;
  • It would be very useful for the console to make a reasonable attempt at figuring out the request body format for PUT and POST requests;
  • Reliance on a WADL document encourages unRESTful thinking and should be avoided;
  • All appropriate links in the response body should be identified, and it should be simple to make further requests and to explore the API by clicking on them.
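As a rough illustration of the last point, and of keeping headers when a link is clicked, here is a sketch that walks a parsed JSON response body, collects anything that looks like an absolute HTTP(S) URI, and builds the follow-up request. It is not taken from the project source: findLinks, followLink and the example.com addresses are hypothetical, and a real implementation would also need to handle XML, HTML and relative URIs.

```javascript
// Recursively collect string values that look like absolute http(s) URIs,
// remembering where in the document each one was found.
function findLinks(value, path = "$", found = []) {
  if (typeof value === "string") {
    if (/^https?:\/\/\S+$/.test(value)) found.push({ path, href: value });
  } else if (Array.isArray(value)) {
    value.forEach((item, index) => findLinks(item, `${path}[${index}]`, found));
  } else if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) {
      findLinks(value[key], `${path}.${key}`, found);
    }
  }
  return found;
}

// Build the request for a clicked link, carrying over the previous headers
// so that Accept and authorisation settings are not lost.
function followLink(previousRequest, href) {
  return { method: "GET", uri: href, headers: { ...previousRequest.headers } };
}

const body = {
  name: "Example Artist",
  url: "http://api.example.com/artist/1",
  releases: [{ title: "Example Release", href: "https://api.example.com/release/9" }],
};
const links = findLinks(body);
const next = followLink({ method: "GET", uri: "/", headers: { Accept: "application/json" } }, links[0].href);
```

Each entry in links could then be rendered as a clickable element in the response view, with next being the request sent when the first one is clicked.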

Implementation decisions

I decided to implement this project in HTML and JavaScript, as this seemed the most portable platform. I am working on the assumption that the finished product will be a Chrome extension, as this lets me make some simplifying assumptions about the capabilities of the browser environment, and may also help solve some security issues.

References

My First Fortnight – Switching Agile Teams

Posted in API, Agile, Development on September 14th, 2011 by Paul Shannon – 1 Comment

I’ve always encouraged new starters in my previous team to write a post summarising their first impressions, so after starting at 7Digital 2 weeks ago I thought I’d do the same. While the aforementioned new starters are usually fresh faced graduates, I’m more a lived-in, agile, worked-at-the-coal-face, TDD obsessed, software craftsman who played a major part in bringing agile principles and practices to my former team. I’ve joined the services team, working on the 7Digital API, the Asset Server and Accounts web app.

First Impressions

My first impressions of the new team, working environment and codebase are positive, with similarities to how I am used to working. One major difference I noticed was the number of tests – I’m used to having the odd regression level test backed up by mainly unit tests with fewer integration tests in favour of collaboration tests using mock objects. The approach here suits the products well and is fantastic for a new starter, as you can safely refactor or transform most of the code with confidence.

Outside In

In my first two weeks the majority of tasks I’ve been working on are around configurations, and higher level tests (acceptance and smoke tests). This has simply been down to the nature of work at the moment but has served me well in orienting myself with the systems. I was paired on a task to create a diagram of dependencies for services in our team. This helped me understand the detail of the SOA environment that the team works with. It also provided a useful resource for the rest of the team so I was pleased that I was able to add value early on. Working on the acceptance and smoke tests was useful because of the old adage of “tests as documentation”. The behaviour style naming and separation of actions using cucumber syntax meant that I could understand the tests, and the API it was testing, quickly without too many questions. The simple language meant I could see what the system is supposed to do before I am fully immersed in the team’s domain.

Code Quality – Team Goals

I’ve seen a mixture of code quality because of the variety in levels of legacy code the company has (or “legacy, legacy code” as Rob puts it) but I’m pleased that there is a whole team mentality to improve the quality of any code we touch. It was a pleasant surprise when my suggestion of encapsulating a primitive Dictionary as a class with a single responsibility and internal state was met with enthusiasm rather than resistance. I hope my experience and attitude for improving code quality can help the team meet their goals, as I think I’ll fit in well here.

Friday Deployments

There is a strict rule here that deployments on a Friday are only for exceptional circumstances. I like this policy; I had it in my last job where it was colloquially known as Shannon’s Second Law (Shannon’s First Law being attributed to Claude Shannon, and describing maximum transmission bandwidth for a channel with noise). I was surprised to see an emergency deployment on my first Friday though, and thought I should record it here for posterity ;)

Sustainable Pace

The pace here is a little slower than I’m used to, yet we seem to be adding a great deal more value. It makes the working environment relaxed, which helps to maintain focus when you need to. The frequency of commits to the master branch, continuous integration runs and high quality code mean that this practice really works well here. I’m just trying not to be too eager and to adjust to the natural flow.

Friendly Folk

There is a very social atmosphere amongst the teams here, from sharing lunch around the sofas, to getting a beer in on a Friday. Even the odd flame war is done in good humour. It makes a particularly daunting time far easier for new starters.

I’ll probably add some posts in the future on my experiences here, along with some more technically insightful nuggets – for now though you could always follow me @BlueReZZ