Welcome to Blogs From The Geeks, Intermittent insightful nuggets.


7digital @ University of Nottingham Open Day

Posted in Community Events on May 11th, 2012 by Paul Shannon
Open Day 2012

Following our recent lectures at the University of Nottingham, the Computer Science School invited two of us to help judge the Alumni Prize at the 2nd Year Project Open Day on 9th May. The day is the pinnacle of the group project course, as each team exhibits the software they’ve developed over the previous year at a mock trade fair. For the day, the computer lab is transformed with pull-up banners, posters, screens demonstrating products, iPhones, Android tablets and all manner of gadgetry to show off the development skills of the 2nd year students.

The Prizes

There were two groups of judges, with our group being tasked with judging the stalls, demonstrations and “sales patter” on the day rather than the software, which was left to representatives from the University and IBM. This meant we could easily have been swayed by the number of cakes and sweets on offer at each stall, but we kept our professionalism and judged based on the quality of the demonstrations and the enthusiasm of the students.

Paul Shannon handing out our top quality 7digital prizes
Happy Winners

Our top prize went to a group who developed a rhythm game for MIDI keyboards in a similar manner to Rock Band or Guitar Hero. We were impressed by how engaging the stall was, inviting visitors to attempt a track and have their score put on the wall of fame. The demonstration and explanation of how the software was built, how it works and the problems and pitfalls the team had faced were excellent. Above all though, the enthusiasm of the team for their project and the open day made them stand out from the other groups, so they were rightly awarded the alumni prize and the coveted 7digital t-shirt.

Other notable projects included a Settlers of Catan game, an online marking web site for lecturers and students, an epub to PDF converter for people that love a dead tree book, a Last.fm faux-3d genre tag mash-up and a “hot or not” style web site that tried to rate MPs based on the “friendliness” of their faces. We were impressed by the level of innovation and variety of projects on display. Having projects that were relevant to our industry was a bonus too and we promptly spoke to some groups about our internship and apprenticeship schemes.

Come And Work With Us

For those at the event that we didn’t get a chance to speak to, I should mention our Technical Academy and Internship schemes. We’ve recently started the 7digital Technical Academy with the idea of taking on graduates with no commercial software development experience. We use Agile principles and practices with test driven development, and we find it difficult to find people with experience in this area. The academy will help you to become a 7digital developer and help us to get enthusiastic and intelligent graduates working with us despite having no experience. With a mixture of classroom sessions, pair programming and an assigned mentor, we hope our apprenticeship scheme will help extend your education at the University of Nottingham to give you the skills you’ll need to work in a fast-paced start-up in the heart of London’s “Silicon Roundabout”.

More details on the Technical Academy and the Graduate/Apprenticeship Scheme are on its own special page.

More details on the Development Team Internship are available on our jobs page.

Development Team productivity at 7digital

Posted in Uncategorized on May 9th, 2012 by robbowley

Some statistical evidence for benefits of Agile principles & practices

At 7digital we started moving to what could best be described as “Agile” practices & principles around three and a half years ago. Our approach focuses on self-organising, empowered product delivery teams with the overriding objective being maintainable, sustainable development.

At a more granular level this includes a focus on practices such as Continuous Delivery, Continuous Improvement, Kanban & Theory of Constraints, Systems Thinking, Test Driven Development, Refactoring, Pair Programming, and Emergent Design. We’ve talked about our experiences in more depth at events like XPDay 2009, LimitedWipSociety and QCon London 2012.

We started recording data on the work being undertaken around three years ago and have amassed data for around 2,600 work items. We felt it was about time we started sharing some of this data, hopefully to offer encouragement to others that there are some tangible benefits to the practices and principles largely grouped under the Agile Software Development umbrella.

The latest report is for the period of a year between April 2011 and April 2012. Within the report we include comparisons with the first two years of data, which were aggregated into a single report ending April 2011 that we have not yet published externally.

Click here to download the 7digital Development team statistical analysis report April 2011-2012

The report shows the trend for significant improvements over the period, which follows similar improvements seen in the first two years. The report contains more detail and some analysis, but basically we’re doing a lot more work and we’re delivering it more quickly – something we largely attribute to the practices we’ve adopted.

Some key figures regarding the improvements we’ve seen over the last year:

  • a 43% improvement in Cycle Times for all work items
  • a 50% improvement in Cycle Times for feature work
  • a 79% increase in Throughput for all work items
  • a 65% increase in Throughput for feature work
  • a 6% reduction in the proportion of bugs to features (but an increase in bugs overall)
  • a significant reduction in cycle time variance

Some other things we can say at a more atomic level over the period:

  • the average cycle time for a feature/MMF is 5.7 working days
  • the average cycle time for all work items is 4.5 working days
  • we average around 111 completed work items in a month (of which 45 are features/MMFs)
  • the Cycle time variation for all work items is 5.5 days
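As a rough illustration of how figures like these can be derived from raw work-item records, here is a minimal Ruby sketch. The WorkItem struct, its field names and the sample dates are hypothetical, and it counts calendar days, whereas the report quotes working days, so treat it as a sketch of the arithmetic rather than the method used in the report.

```ruby
require 'date'

# Hypothetical work item: when work started, when it finished,
# and whether it was a feature (as opposed to a bug or other item).
WorkItem = Struct.new(:started_on, :finished_on, :feature)

# Cycle time in calendar days (assumption: the report uses working days).
def cycle_time_days(item)
  (item.finished_on - item.started_on).to_i
end

# Mean cycle time across a set of items.
def average_cycle_time(items)
  return 0.0 if items.empty?
  items.sum { |item| cycle_time_days(item) }.to_f / items.size
end

# Throughput: number of items completed in a given month.
def throughput(items, year, month)
  items.count do |item|
    item.finished_on.year == year && item.finished_on.month == month
  end
end

# Made-up sample data, purely for illustration.
SAMPLE_ITEMS = [
  WorkItem.new(Date.new(2012, 3, 1), Date.new(2012, 3, 6), true),
  WorkItem.new(Date.new(2012, 3, 5), Date.new(2012, 3, 8), false),
  WorkItem.new(Date.new(2012, 3, 28), Date.new(2012, 4, 2), true)
].freeze
```

Filtering the list on the feature flag before calling either function gives the feature-only variants of the same figures.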

Here are some pretty graphs (lots more in the report):

Note how cycle time decreased most significantly in the first two years and has been pretty stable at a low level since.

Next up a very healthy trend for an increase in throughput:

Which interestingly had remained stubbornly resistant in the first two years:

Our biggest disappointment has been the comparative lack of improvement in bugs, but the great thing about doing this analysis is we have very good data on where most of these bugs are being created meaning we can focus our efforts in particular areas.

We’re proud of the progress we’ve made and at the same time excited because we feel there’s still a lot of room for improvement. Hopefully next year we’ll publish a similar report for comparison.

We hope you find this information useful. We would of course be interested in any feedback or thoughts you have. Please contact me via twitter: @robbowley or leave a comment if you wish to do so.

How we do deployments at 7digital.

Posted in Continuous Integration, How To, Ruby on April 28th, 2012 by Hibri Marzook

At 7digital, deployments to a production environment are a non-event. On a given day, there can be at least 10 releases to production during working hours. On some days there are even more, especially on Thursdays before the 4pm cut-off, as we don’t deploy to production after 4pm or on Fridays.

Deployments to our internal testing environments happen constantly, on every commit. I’m able to deploy a quick fix, or patch something in production without having to make changes on a live server. We rarely roll back, instead roll forward by fixing the issue. This is made possible due to our investment in build and deployment tools. We attempt to treat these with the same care as our production code.

This post is about how it works.

A little bit about our stack.

Our services run on the .Net stack, mostly with SQL Server back ends. We use IIS 7 and IIS 6, load balanced behind HAProxy.

We use TeamCity to trigger Rake scripts that do our deployments. The Albacore gem is used for the majority of tasks. We use code derived from Dolphin deploy to configure IIS.

In the beginning.

We used MSBuild for building our solutions and deploying software. However, this was very painful, and led me on a personal crusade to get rid of MSBuild for deployments. XML-based build frameworks limit what you can do to what is defined in the particular framework’s vocabulary. A big pain was having to store configuration and code in the same MSBuild XML files. It wasn’t possible to separate the two without writing custom tasks.

A build framework in a programming language allows you to be much more fluent and write readable scripts. You have the power to use all the libraries at your disposal to write deployment code, instead of being limited to an XML schema definition. In addition to Ruby, we have a couple of projects using PowerShell and psake.

The current setup.

[Diagram: deployment pipeline]

The diagram above shows the major parts of our deployment pipeline.

We keep the build and deployment code along with the project code, to maintain a self contained package, with minimal dependencies on anything else.

A project has a master rake file, named rakefile.rb in the root directory of the project. This rake file references all the other shared rake scripts and ruby libraries needed for build and deployment.

These libraries and scripts are kept in a sub directory named build. A typical project structure looks like this:

root
    build
        conf
        lib
    src
        XX.Unit.Tests
        XX.Integration.Tests
        XX.Web

The conf directory contains the configuration settings for IIS, including the host headers, app pool settings and .net framework version settings.

The Albacore build gem has everything that is needed to build a .Net solution. We use it to compile our code on TeamCity and to run our tests.

When something is checked into VCS (git), TeamCity triggers a build and compiles the code. This build process packages the deployment scripts and the web site package, which will be used for deployment. TeamCity stores these as artefacts, which allows us to reuse them without building again.

To deploy a website, a TeamCity build agent retrieves all the necessary zipped packages and uncompresses them to the current working directory.

The build agent calls a rake task with the parameters:

   rake deploy[environment, version_to_deploy, list_of_servers, build_number]

An example

   rake deploy["live","1.2","server1;server2;server3","123"]

The environment parameter specifies which deployment settings to use. Deployment settings are stored in YAML files that the rake scripts read. A YAML file for IIS settings looks like this:

uat:
  site_name: xxx.dns.uat
  host_header:
    80:xxx.dns.uat:*
    443:xxx.dns.uat:*
  dot_net_version: v4.0
live:
  site_name: xxx.dns.com
  host_header:
    80:xxx.dns.com:*
    443:xxx.dns.com:*
  dot_net_version: v4.0

We can add a new environment, and change settings for an existing environment by changing a configuration .yml file, without having to change deployment scripts.
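A minimal sketch of how a rake script might pick the settings for one environment out of such a file, using Ruby's standard YAML library. The function name settings_for and the inline sample document are assumptions made for illustration, not our actual deployment code, and the sample leaves out host headers to stay short.

```ruby
require 'yaml'

# Hypothetical settings document; a real script would read it from the
# conf directory instead of embedding it.
IIS_SETTINGS_YAML = <<~YAML
  uat:
    site_name: xxx.dns.uat
    dot_net_version: v4.0
  live:
    site_name: xxx.dns.com
    dot_net_version: v4.0
YAML

# Returns the settings hash for one environment and fails loudly when
# the environment has no entry in the file.
def settings_for(environment, yaml_text)
  all_settings = YAML.safe_load(yaml_text)
  all_settings.fetch(environment) do
    raise ArgumentError, "No settings for environment '#{environment}'"
  end
end
```

Because the lookup is driven by the environment name alone, adding a new top-level key to the file is enough to make a new environment deployable.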

The version_to_deploy parameter loosely translates to a virtual directory. This is ignored for websites that deploy to the root. The list of servers is an arbitrary list of servers that we deploy to. This allows us to deploy to a single server or a cluster.

The rake deploy task calls two other rake tasks for each server in the list of servers. The first task copies all deployment scripts and the web package to the target server. The second triggers a remote shell command to do the actual installation.

In pseudo code:

  deploy
       foreach(server in servers)
             copy scripts and packages to server
             trigger remote installation on server

The actual installation process does not happen from the build agent, but on the target server. The build agent does not have the necessary network access and admin rights. Our servers expose only SSH.

The deployment sequence is controlled by chaining rake tasks. This allows us to run any of the tasks individually from the command line to do a manual deployment or to test.

The remote installation task copies all the web site binaries to the correct locations under IIS, and configures IIS. Application pools under IIS are stopped while this happens, and the virtual directory and, if needed, the web site are rebuilt. The application pools are restarted after this.

The deployment repeats the process on the next server in the list.
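The per-server sequence can be sketched in plain Ruby as follows. This is an illustration under assumptions rather than our rake code: parse_servers and deploy are hypothetical names, and the copy and install steps are passed in as callables standing in for the real rake tasks.

```ruby
# Splits the semicolon-separated server list passed to the deploy task.
def parse_servers(list_of_servers)
  list_of_servers.split(';').map(&:strip).reject(&:empty?)
end

# Runs the two steps for each server in turn, finishing one server
# before moving on to the next.
def deploy(list_of_servers, copy_step:, install_step:)
  parse_servers(list_of_servers).each do |server|
    copy_step.call(server)     # copy scripts and packages to the server
    install_step.call(server)  # trigger the installation on the server itself
  end
end
```

Passing lambdas that only record their argument, for example copy_step: ->(s) { log << "copy #{s}" }, makes it possible to check the ordering without touching a real server.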

 

The future.

What we have now helps us a lot, and has allowed us to scale up to this point. However, to grow even more there are a few things that hold us back.

For example, a lot of infrastructure details creep into our configuration and scripts and are stored in source control, which is mostly used by devs. This means that when our operations folk make a change to the infrastructure, the devs have to change our configuration settings to reflect this. I would like to have all configuration settings stored somewhere central, with the scripts calling out to a service to get all the settings for a particular environment and application. This service would be maintained by devops, and would be kept synchronized with changes made to the infrastructure.

The same can be done for the list of servers. Instead of a developer having to know which servers comprise an environment, the script could ask the same service for a list of the servers in a given environment. This will allow us to scale transparently, by adding a new server to the list and doing a fresh deploy.

Summary.

I’ve tried to capture an overview of how we deploy our software at 7digital. There is a lot of detail I haven’t gone into, especially the nitty gritty of setting up IIS host headers, ports and app pool settings. A build and deployment framework is something we set up from day one of any new project. We make sure that we have a skeleton application deployed all the way to production before any new code is written.

Feel free to get in touch if you have any specific questions.

Resources:

http://codebetter.com/benhall/2010/10/22/dolphin-deploy-deploying-asp-net-applications-using-ironruby/

http://albacorebuild.net/

7digital shared playlists built in node.js

Posted in Innovation, Node.js on April 23rd, 2012 by mikelam

Introduction

7digital shared playlists is a real time web app I made that enables users to listen to the same playlist and add tracks to it. Please open it up in a couple of browsers to see how the real time functionality works.

http://electric-summer-3784.herokuapp.com/

It’s the result of maxing out my two days of innovation time this month at 7digital, plus a little extra over the weekend as two days never seems enough!

Source Code

https://github.com/treadsafely/node-js-7digital-shared-playlists

node.js and WebSockets

The tech stack I chose was node.js with WebSockets for real-time communication. WebSockets carry much less overhead in server communication than ajax does, and node.js can comfortably handle all of the WebSocket connections via its single-threaded, non-blocking event loop architecture.

Not all browsers support WebSockets, which is where socket.io comes in. Socket.io uses WebSockets if they are available; if not, it will use Flash sockets, and if these are not available either, it falls back to long polling. Now.js (another node module) is built on top of the socket.io module. Now.js makes it really easy to sync your variables and functions between client and server with the “now” object, which can be accessed on either side.

Web framework

I used express, a lightweight web framework that makes it really easy to create the http server, create the routes and render views. I used a typical MVC architecture to structure the application.

Audio

Audio is handled by audio.js. Audio.js enables the HTML5 audio tag to be used anywhere, regardless of browser support, by using a flash player if the audio tag is not supported.

Persistence

MongoDB plays very nicely with node.js. Mongoose is a node.js wrapper for MongoDB, and makes for very easy persistence of JavaScript objects. Current users, chat history and playlist tracks are all persisted within the application.

Mapping XML to JSON

Unfortunately, the 7digital API does not yet support JSON. I used node-xml2json to do the mapping.

View templating

I used Jade, pretty much the de-facto standard in view templating in node.js. It’s incredibly intuitive, very readable and supports express out of the box.

Deployment

Hosting is taken care of by heroku. I wanted to use JoyentCloud to host as it has full support for WebSockets, but they are not provisioning new smart machines due to being at full capacity. Heroku does not have WebSocket support and uses long polling instead. Not great, but the resultant effect is pretty much the same.

Conclusion

The 7digital shared playlist app was a lot of fun to make, and hopefully a demonstration of the power of real time apps – providing value by enabling online communities to interact without the requirement of refreshing the page, or the server overhead of concurrent ajax calls or polling.

Metric Driven Development Fueled by StatsD and Graphite

Posted in API, Agile, Innovation, Metrics, Testing on April 18th, 2012 by goncalopereira

Why metrics?

Since I joined 7digital I’ve seen the API grow from a brand new feature, side by side with the (then abundant) websites, to be the main focus of the company. The traffic grew and grew and keeps on growing at an accelerating pace, and that brings us new challenges.

We’ve brought the agile perspective into play which has made us adapt faster and make fewer errors but:

  • We can do unit tests but they don’t bring out the behaviour.
  • We can do integration tests but they won’t show the whole flow.
  • We can do smoke tests but they won’t show us realistic usage.
  • We can do load tests but they won’t have realistic weighting.

Even when we write acceptance criteria we are actually being driven by assumptions. Even an experienced developer is really just sampling all their previous work, and as we move to a larger number of servers and applications it’s not humanly possible to take all variables into consideration.

It is common to hear statements like ‘keep an eye on the error log/server log/payments log when releasing this new feature’, but when something breaks it’s all about ‘what was released/when was it released/is it a specific server?’. As the data grows it becomes harder to sample and deduce from it quickly enough to feed back without causing issues, especially as agile tends to implement intermediary solutions whose behaviour might differ from the final solution’s and has not been studied.

The truth is that nothing, including developers’ opinions, replaces real-life data and statistics. If the issue is a black swan then we need to churn out usable information fast!

Taken from @gregyoung

This has been seen before by other companies; for example, Flickr on their Counting and Timing blog post. See also Building Scalable Websites by Flickr’s Cal Henderson.

This advice has been followed by other companies like Etsy on their Measure Anything Measure Everything blog post or Shopify on their StatsD blog post.

How to do it?

Deciding to back a winning horse, I picked up the tools used by these companies:

StatsD is described as “a network daemon for aggregating statistics (counters and timers), rolling them up, then sending them to graphite”.

Graphite is described as “a highly scalable real-time graphing system. As a user, you write an application that collects numeric time-series data that you are interested i[...]. The data can then be visualized through graphite’s web interfaces.”

The way to implement these is covered in several tutorials. I used StatsD’s own example C# client to poll our own API request log for API users, endpoints used, caching and errors.

In the future it would be ideal for the application to access StatsD itself instead of running a polling daemon.
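Reporting to StatsD directly from an application is cheap, because a metric is just a short line of text sent over UDP in the form bucket:value|type, where the type is c for a counter or ms for a timer. The sketch below is in Ruby rather than C#. The helper names and the bucket names used with it are hypothetical, and the host is an assumption (8125 is StatsD's default port).

```ruby
require 'socket'

# Formats one metric in the StatsD plain-text wire format:
#   <bucket>:<value>|<type>
def statsd_datagram(bucket, value, type)
  "#{bucket}:#{value}|#{type}"
end

# Counter increment, e.g. one more request to an endpoint.
def counter(bucket, by = 1)
  statsd_datagram(bucket, by, 'c')
end

# Timer, e.g. how long a request took in milliseconds.
def timing(bucket, milliseconds)
  statsd_datagram(bucket, milliseconds, 'ms')
end

# Sends a datagram over UDP, fire-and-forget.
# Host and port are assumptions for a locally running StatsD.
def send_metric(datagram, host = '127.0.0.1', port = 8125)
  socket = UDPSocket.new
  socket.send(datagram, 0, host, port)
ensure
  socket&.close
end
```

A call such as send_metric(counter('api.requests')) costs the caller almost nothing, since UDP does not wait for a reply and a missing StatsD daemon does not block the application.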

There are a lot of usable features on Graphite. The ones I’ve used so far include Moving Average which will smooth out spikes in the graphs making it easier to see behaviour trends in a short time range and Sort by Maxima.
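To show what the smoothing does, here is a minimal simple moving average in Ruby. It is a sketch of the general technique only; Graphite's own function may treat the first points of a series differently.

```ruby
# Simple moving average over a sliding window of `window` points.
# Returns one value per full window, so the result is shorter than the input.
def moving_average(values, window)
  raise ArgumentError, 'window must be positive' unless window.positive?
  values.each_cons(window).map { |slice| slice.sum.to_f / window }
end
```

A single spike in the input is spread across `window` output points, which is why short-lived peaks flatten out and the underlying trend becomes easier to see.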

There are even tools to forecast future behaviour and growth using Holt-Winters forecasting, and this is used by companies to understand future scalability and performance requirements based on data from previous weeks, months or years (seen in this Etsy presentation on Metrics).

How it looks and some findings

Right away I got some usable results. An API client had a bug in their implementation which meant they requested a specific endpoint more often than they actually needed to. This kind of data can help with debugging and also prevent abuse.

Sampled and smoothed usage per endpoint per API user…

Another useful graph is error rates, which might be linked with abuse, deploying new features or other causes.

Error chart smoothed with a few spikes but even those are on the 0.001 % rate

Here is some useful caching information per endpoint to know how to tune up TTLs or look for stampede behaviour.

Sampled and smoothed Cache Miss per Endpoint

Opinion

After you start using live data to provide feedback for your work there is no going back. It is my opinion that analysis of short and long term live results of any type of work should be mandatory as we move out of an environment that is small enough to be maintained exclusively by a team’s knowledge.

Database schema evolution and its traceability

Posted in Databases, Development, Testing, Uncategorized on April 17th, 2012 by andresaragoneses

Versioning of database schemas is a tricky problem. To solve it you normally have to deal not only with tools, but also with people in the work environment (basically, the culture within the organisation).

In our case, we had been using a tool that claimed to be the most used in the .NET world for this kind of problem: Migrator.NET.

It worked well but obviously it wasn’t perfect, and combined with our environment it had a lot of drawbacks, such as:

  1. It required you to write every database change in C# code. It had its own .NET API for this, and although you could supply custom SQL scripts, you still had to wrap the call to each script in .NET code.
  2. Its API wasn’t very well designed, so writing tools or custom build/deployment tasks that interacted with it was hard. Real example: http://stackoverflow.com/questions/894120/how-do-you-tell-if-your-migrations-are-up-to-date-with-migratordotnet
  3. It had no surrounding tools (ecosystem) and no community to speak of: at the time of this writing, 50% of the last 4 messages of the mailing list are spam, and 25% of them were written in April 2011; the sources are hosted on Google Code and GitHub at the same time (we assume the move to the latter was prompted by the lack of contributions), etc.

Drawbacks 1 and 2, combined with the fact that our DBAs don’t launch a Visual Studio instance every day at work (no offense!), meant that over the years changes were made to the databases without using Migrator.NET. This was not only inconvenient because the changes could not be traced, it also made it impossible to determine whether a change had been propagated successfully to all our testing environments and test databases. The chances of a change not being propagated were higher still when you take into account that some test suites depended on their own databases on some server, instead of creating and dropping them on the fly for each test run (either because they depended on some specific data in the DB, or because there was no up-to-date schema in version control that they could use to create it at the setup of the test run; both equally enraging reasons, of course).

Therefore we needed to fix not only this problem (making it easy for DBAs to make changes with the same tool the developers use) but also the inconsistency it had created.

So, let’s talk first about this problem: how to fix inconsistency between database schemas? (Differences between a database in the LIVE environment compared to other databases in testing environments.)

Inconsistency Test

Part I: Consistency of databases across environments

First, we need to detect these inconsistencies to be able to fix them obviously. How to detect them?

There are good commercial tools out there to do it, and we had actually purchased one (RedGate). It turns out to be very good at this job; however, owning just one license meant that we could only run it on one server, so access to it was limited. Plus, while considering a fix to the consistency of our database schemas, we realised that fixing it once wouldn’t prevent the schemas from getting out of sync again in the future, even if we moved from Migrator.NET to a tool that everyone could use (not only developers). Unless… the solution chosen is built with automation in mind: wrap the comparison procedure in a test that can be run every once in a while.

At the time, we somehow missed the fact that RedGate also had command-line tools to compare databases, so we went ahead and wrote a proof of concept: a small test suite that compared table schemas using the SMO API provided by the Microsoft SQL Server .NET SDK. The proof of concept was successful and useful enough that we evolved it into what it is today: a library that, among other things, provides an abstract class that allows you to compare databases (not only tables, but foreign keys, views, functions, stored procedures, and triggers!) by just creating an NUnit-based TestFixture class that inherits from it:

namespace FooCorp.FooProject.Db.Tests
{
    [TestFixture]
    public class FooBarDbConsistencyTests : DatabaseComparisonTests
    {
        public override string DbName
        {
            get { return "foobar"; }
        }
    }
}

If you provide the proper config settings to connect to your database “foobar” in your app.config file, your “foobar” database will be examined and compared to a disposable DB.

Whooops, and now you may be wondering what a disposable DB is? Maybe I should have started this exercise with that, sorry! :) Basically, a disposable DB is a DB that you create just for development/testing purposes. As it is just for testing purposes, you can normally throw it away after using it without problems, which is why we call it “disposable”.

You can only create a disposable DB if you have the schema of the DB that you want to create, obviously, in text files (for convention purposes, I use the “.SQL” extension for these files).

To extract the SQL script files of the schemas of our production databases we used SQL Server Management Studio. Our tool reads these files and executes them to create a new disposable DB, and then compares the schema of each element it contains (each element being an independent [Test] member of the [TestFixture]) with the “foobar” database specified in the config file.
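The comparison step itself boils down to a diff between two sets of named definitions. The sketch below shows that idea in Ruby; it is not the DatabaseMigraine API (which is C# and reads schemas through SMO), and the function name and the hash-of-definitions representation are assumptions made for illustration.

```ruby
# Compares two schema snapshots, each a hash of object name => definition text.
# `expected` comes from the disposable DB built from the scripts in version
# control; `actual` comes from the database under test.
def schema_differences(expected, actual)
  shared = expected.keys & actual.keys
  {
    missing: (expected.keys - actual.keys).sort,     # in scripts, not in the DB
    unexpected: (actual.keys - expected.keys).sort,  # in the DB, not in scripts
    changed: shared.reject { |name| expected[name] == actual[name] }.sort
  }
end
```

A consistency test then passes only when all three lists are empty; reporting each object separately mirrors the one-test-per-element layout described above.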

Why do it this way? Because:

a) This way we make sure at the same time that we don’t break our SQL files (the creation of the DB is tested before the comparison).

b) If we didn’t compare against a disposable DB, which two databases would you compare? You could compare each DB in every one of your testing environments with the LIVE DB, but if you did this, how would you make sure that the LIVE DB doesn’t get out of sync with the SQL scripts that you have in version control? This is an easy way to make sure that the LIVE database doesn’t receive unwanted changes, usually performed by cowboys ;)

Here you can see an example of a screenshot of one of our CI builds to track inconsistencies between environments:

DB consistency tests in TeamCity

We schedule them to run every day at midnight (because each consistency test is kind of long, and we don’t want to steal the build agents from our developers!).

In the next parts of this blog series, I will explain the following topics:

Part II: How disposable databases are created behind the scenes

Part III: How to perform changes (migrations) in the databases without creating inconsistencies across environments

Part IV: Migration recycling

Part V: How to inject disposable DBs creation into our acceptance/integration tests

Yes, you can do all these things with the set of tools we created. And, as the migrations were a central part of the problem they try to solve, we decided to call it DatabaseMigraine ;)

Until the next parts of this blogging series arrive, you can already check out the code in GitHub.

Removing non-determinism from our acceptance tests

Posted in Continuous Integration, Ruby, Testing on April 3rd, 2012 by mikelam

The web dev team at 7digital has been on quite the journey with our acceptance tests: automated regression tests based upon acceptance criteria. It all began in good faith 18 months ago with the start of a new project – to create a brand new 7digital website with a completely new codebase, look and feel.

The drop in confidence that came with non-deterministic tests

The complexity of these tests grew over the course of the next few months. Data setup and data access within our step definitions was implemented using a mixture of ActiveRecord and a C# console application called DatabasePopulator. The tests were run in a load balanced environment called systest, and were effectively end-to-end tests that also tested the functionality of all the internal and third party APIs we connected to.

Our acceptance tests started failing increasingly for non-deterministic reasons. Many things were blamed for this: the systest environment being too slow, data access interference across concurrent acceptance test runs, the Capybara framework not being able to click buttons, caching, unreliable internal APIs, asynchronous behaviour and more. The truth was that no one quite knew why on earth the tests kept failing. This damaged the morale of the team, and acceptance tests were often ignored in development or very poorly written.

The state of the acceptance tests deteriorated to the point where the team was on the brink of removing them altogether, and leaving a small handful as smoke tests. They were taking up huge chunks of time that could be spent developing new features, and we had deadlines to meet.

Discussions

We had discussions and meetings about what we wanted from these tests – what value did they give? The team argued somewhat, but the general consensus was that we were trying to test too much – doing end to end testing was not a good idea, and we should simply focus on testing logic.

End to end testing was not the goal. We decided we wanted our tests to tell us whether the code we had written as a team fulfilled the acceptance criteria. We had ended up chasing our own tails by trying to test everything end to end. After all, it was not our team’s responsibility to make sure that the APIs we called did what they said. It was not our responsibility that the load balancer would always behave. The inter-connectedness of the shared databases underneath our internal shared APIs left consistency of data out of our control. Each of the sub-systems we connected to and used had its own unit, integration and acceptance tests, so why should we replicate them? Some asked how, with our API calls stubbed out, we would know that it all really works. The team already had integration tests which checked the endpoints we stubbed, so we felt this was covered. We implemented a separate set of smoke tests: a small suite of end to end tests that covered happy paths and gave us confidence in our interaction with outside systems, such as load balancer setup and purchase using test credit cards. These smoke tests were small in number and took only a few minutes to run, which meant that if they failed, they would fail fast, and could be rerun quickly as necessary if we felt the results were non-deterministic.

A clear definition of what we want to achieve

From our discussions, we coined the term, “Feature test”, with the definition, “Tests a piece of functionality against its specification. The tests should not cross boundaries outside of your control”.

Faster feedback

We isolated the tests that seemed most non-deterministic into their own build and vastly improved debugging feedback via the use of screenshots and html snapshots of failing steps. The screenshots provided immediate reasons for the failures – we were no longer blind men trying to describe an elephant to each other.
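As a rough illustration of the idea, the sketch below saves an HTML snapshot whenever a test step raises. The helper name with_snapshot_on_failure and its parameters are hypothetical and use only the Ruby standard library; in a Capybara suite the session's own save_page and save_screenshot methods would be the natural way to capture the page.

```ruby
require 'tmpdir'

# Hypothetical helper: runs a test step and, if it raises, writes the current
# page HTML to disk before re-raising, so the failure can be inspected later.
# html_source is any callable returning the page markup as a string.
def with_snapshot_on_failure(step_name, html_source, dir)
  yield
rescue StandardError
  File.write(File.join(dir, "#{step_name}.html"), html_source.call)
  raise
end

# Usage with a stand-in page source and a deliberately failing step.
Dir.mktmpdir('snapshots-') do |dir|
  begin
    with_snapshot_on_failure('checkout', -> { '<html>Basket is empty</html>' }, dir) do
      raise 'Pay button not found'
    end
  rescue RuntimeError => e
    saved = File.exist?(File.join(dir, 'checkout.html'))
    puts "#{e.message} (snapshot saved: #{saved})"
  end
end
```

The original exception is re-raised untouched, so the test still fails in the usual way and the snapshot is only extra evidence.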

Isolation

Over the course of the next few months, we isolated the tests as much as we could. We removed caching, and we took our feature tests off systest and into their own isolated environment, free from load-balancing and network slowdowns. We created stubs for our API calls, we removed active record and automated the data setup before each test run, and we implemented the use of disposable databases. The determinism of our tests soared, and the team gained confidence in them.
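To make the stubbing idea concrete, here is a minimal sketch that stubs an API dependency in-process, which is a simpler technique than deploying stub services next to the application. All class names (CatalogueApi, StubCatalogueApi, Basket) are hypothetical and prices are in pence.

```ruby
# The real client would call an API owned by another team over HTTP.
class CatalogueApi
  def price_for(track_id)
    raise NotImplementedError, 'would call the remote API over HTTP'
  end
end

# The stub answers from canned data, so a feature test never crosses
# a boundary outside the team's control.
class StubCatalogueApi
  def initialize(prices)
    @prices = prices
  end

  def price_for(track_id)
    @prices.fetch(track_id)
  end
end

# Code under test: logic the team owns, with its API dependency injected.
class Basket
  def initialize(api)
    @api = api
    @track_ids = []
  end

  def add(track_id)
    @track_ids << track_id
    self
  end

  def total
    @track_ids.sum { |id| @api.price_for(id) }
  end
end

basket = Basket.new(StubCatalogueApi.new(101 => 99, 102 => 79))
basket.add(101).add(102)
puts basket.total
```

Because the stub returns the same data on every run, a failure here points at the basket logic rather than at the network or another team's service.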

The tests should be easy to run

We’ve spent time making the feature tests easier to run too. There are no longer any configuration steps or hard-to-remember command line parameters. We’ve enabled our feature tests to be run from the Visual Studio ReSharper test runner, and abstracted and automated all the configuration changes so anyone can run them.

My list of top ten actions to improve determinism

Here is a list of things, in order, which any team should try and implement, if they wish to improve the determinism of their acceptance tests.

  • Get buy in from your whole team that automated acceptance testing is a good thing.
  • Obtain a clear goal of what you want to achieve from these tests, and a clear definition of their purpose.
  • Set aside time, a good few weeks, to tackle the issue.
  • Fast debugging please. Screenshots and html outputs of all failed steps.
  • Isolate, isolate, isolate. Isolate your environment, stub your api calls, turn off caching and load balancing (use separate tests to test load balancing), isolate your database, and isolate your browser used for testing.
  • Reduce network calls to a minimum. Have the tests run on the same box where they are deployed. The same goes for your stubs – the stubs should be deployed on the same box also so all requests are to localhost.
  • Data should be set up and torn down for each run.
  • The database should be disposable, allowing for isolation and concurrent builds.
  • Ability to watch your tests as they are run by your continuous integration environment.
  • When a test fails, it should fail quickly.
  • Make it easy for QAs to run them – they often have a lot more patience than developers and can spot problems that we can’t. No change of configuration should be required in order for the tests to be run.
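
The points about per-run data and a disposable database can be sketched as follows. This uses a throwaway directory holding a JSON file as a stand-in for a real database; the helper name with_disposable_store is hypothetical.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper: seeds a fresh data store for one test run and removes
# it afterwards. Dir.mktmpdir with a block deletes the directory when the
# block exits, so concurrent runs never share state.
def with_disposable_store(seed)
  Dir.mktmpdir('feature-test-') do |dir|
    path = File.join(dir, 'tracks.json')
    File.write(path, JSON.generate(seed))
    yield path
  end
end

seed = [{ 'id' => '1', 'title' => 'Stub Track' }]
title = with_disposable_store(seed) do |path|
  JSON.parse(File.read(path)).first['title']
end
puts title
```

Each run gets a uniquely named directory, which is what allows several builds to execute at the same time without interfering with one another.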

7digital Tomahawk Resolvers

Posted in API, Community Events, REST, Ruby, community on March 28th, 2012 by goncalopereira – Be the first to comment

Went to Amsterdam Music Hack Day and left with a working Tomahawk Resolver for 7digital previews, a spiked locker integration and some cool ideas on how to promote 7digital with Open Source.

This will allow Tomahawk to use 7digital’s track search and play previews. In the future it will integrate buy buttons and help Tomahawk work with 7digital :-)

The locker integration is spiked using a local service which would be provided/maintained by either 7digital or a third party so it won’t be available immediately after the demo as there are feature, performance and security concerns.

What was used

http://developer.7digital.net/ 7digital API with the demo musichackday API client key

https://github.com/tomahawk-player Tomahawk resolver examples and documentation

API locker service built in Ruby with Sinatra and the 7digital ruby API client gem. Also uses the JSON gem.

Also built some local stubbed API responses using Ruby Sinatra as the connection was slow/failed sometimes :-)
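
A stub of this kind can be very small. The sketch below is not the Sinatra code from the hack day; it is a plain Rack-style callable that needs no gems, and the path and payload are invented for illustration rather than taken from the 7digital API.

```ruby
require 'json'

# Canned responses keyed by request path (invented data, not real API output).
CANNED = {
  '/track/search' => { 'status' => 'ok', 'tracks' => [{ 'id' => '1', 'title' => 'Stub Track' }] }
}.freeze

# A Rack-compatible app is any object that responds to call(env) and returns
# a [status, headers, body] triple, so a lambda is enough for a stub.
STUB_API = lambda do |env|
  payload = CANNED[env['PATH_INFO']]
  headers = { 'content-type' => 'application/json' }
  if payload
    [200, headers, [JSON.generate(payload)]]
  else
    [404, headers, [JSON.generate({ 'status' => 'error' })]]
  end
end

status, _headers, body = STUB_API.call('PATH_INFO' => '/track/search')
puts status
puts body.first
```

Served from localhost through any Rack server, a stub like this keeps a demo working when the venue's connection does not.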

Links

Don’t judge me on my JS..

https://github.com/goncalopereira/7digital-tomahawk-resolver working search with previews with musichackday key

https://github.com/goncalopereira/7digital-stubs stubs…

Service example code

This will authorise the user with a premium API account and get the locker. It is a slow implementation, as locker search is not available and no caching was added, but it works.

require 'rubygems'
require 'sinatra'
require 'sevendigital'
require_relative 'very_simple_cache' # local file, not shown in this post
require 'json'

# Build an API client before every request (replace 'x' with real credentials)
before do
  @api_client = Sevendigital::Client.new(
    :oauth_consumer_key => 'x',
    :oauth_consumer_secret => 'x',
    :lazy_load? => true,
    :country => 'GB',
    :cache => VerySimpleCache.new,
    :verbose => "verbose"
  )
end

post '/authorise' do
  content_type :json

  email = params["email"]
  password = params["password"]
  search = params["search"]

  user = @api_client.user.authenticate(email, password) # not good, should cache more
  locker = user.get_locker() # also caching missing

  # brute_force_search_on_string_compare is defined elsewhere (not shown in this post)
  valid_results = brute_force_search_on_string_compare(user, locker.locker_releases, search)

  valid_results.to_json
end
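
The search helper itself is not shown in the post. Purely as an illustration of what a brute-force string comparison over locker contents could look like, here is a sketch; the Release struct and the function name brute_force_title_search are hypothetical and do not reflect the real objects returned by the 7digital gem.

```ruby
# Hypothetical stand-in for a locker release; the real gem's objects differ.
Release = Struct.new(:title, :artist)

# Walks every release and keeps those whose title contains the search text,
# ignoring case. Linear in the size of the locker, hence "brute force".
def brute_force_title_search(releases, search)
  needle = search.to_s.downcase
  return [] if needle.empty?

  releases.select { |release| release.title.downcase.include?(needle) }
end

locker = [
  Release.new('Abbey Road', 'The Beatles'),
  Release.new('Road to Ruin', 'Ramones'),
  Release.new('Blue', 'Joni Mitchell')
]
matches = brute_force_title_search(locker, 'road')
puts matches.map(&:title).inspect
```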

Introducing “Devs in the ‘ditch”

Posted in Agile, Announcements, Community Events on March 19th, 2012 by chrisodell – Be the first to comment

At 7digital we’ve always loved and appreciated the London developer community. All members of our development team regularly attend events held by groups such as the London .Net User Group, London Software Craftsmanship Community, Agile Evangelists, Extreme Tuesday Club, the various “In The Brain” events at Skillsmatter and many others.

We feel it is now our turn to give back to the community and as such we will be hosting free independent, semi-regular evening events at our office, called Devs in the ‘ditch, and we’d like everyone to come along. We’ve set up a meetup group for future events: http://www.meetup.com/devs-in-the-ditch/

We’re planning to host a few talks from members of the team, many of whom have given the same or similar talks at other events including QCon London and Software Craftsmanship, and we will also host other members of the community.

Our first event is on Thursday 19th April and will feature Paul Shannon (@BlueRezz), who will be running a clinic on Code Smells:

Computer Scientists and Software Developers love analogies. You’ll find that the Agile fraternity love metaphors and double meanings more than any other breed of software developer. A Code Smell is one of the most obvious metaphors and comes from the instinctive reaction to bad smelling food – a primordial reaction to tell you that the food is bad, of low quality and should not be eaten. While smelly code might not poison you it can turn the stomach once you’ve honed your software craftsmanship skills. We’ll begin the session with an overview of some of the more common code smells. Bring your argument hats, as some of these points are going to be controversial.

Paul premiered his talk at XP Manchester early last year and also performed it at Agile Staffordshire where they found that the Code Smell Clinic was a good idea – like group therapy/free consultancy.

There will also be drinks and after the talk everyone is welcome to join us at the pub round the corner. Plus, a free t-shirt for everyone!

Please note that entry will be via the rear door of the office situated in Clifton Street. There will be signage, so you should be able to find it.

Thursday 19th April
Doors open at 6pm to start at 6.30pm

Sign up on the Eventbrite page here.

Sundown in Solr Town – The Indexing Shootout

Posted in Solr on March 9th, 2012 by nicktune – 1 Comment

At 7digital we have over 33 million records of musical data currently indexed in our Apache Solr installation, powering our search API. Well done if you guessed this results in a constant stream of new data in need of indexing for client consumption... And congratulations, you’ve found a bottleneck preventing us from making that data available for consumption as rapidly as possible.

Xml is the format we currently send data to Solr in for indexing, but as Web APIs did previously, we’re going to challenge its verboseness as we ruthlessly hunt down performance – with a sundown shootout.

Meet the contenders: Xml, Json, and the humble Csv.

Procedure

We are going to test set sizes of 50, 500, 1000, and 2000 records. For each we are going to generate C# objects filled with dummy data, and then serialize them to each format. Here is the data-generation code:

public IEnumerable<Track> GetTracks(int setSize)
{
    for (int i = 0; i < setSize; i++)
    {
        // Same dummy text for every track; only the id, ISRC and title vary
        var t = "Shootout is at sundown. Everybody knows Xml is going to lose..."
            + "But who will win... Csv or Json?";
        yield return new Track
        {
            id = (1000 + i).ToString(),
            edgengramtext = "BlahBlah dkjfl dkjdfd kdjf",
            text = t,
            trackDuration = "3.15",
            trackFormatIds = "1",
            trackISRC = (2000 + i).ToString(),
            trackNumber = "1",
            trackRank = "1",
            trackShopId = "34",
            trackTitle = "Track " + i,
            trackUrl = "http://track.track.com"
        };
    }
}
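
The serialization step is not shown in the post, so the following is only a sketch of how the same records might be rendered in the three formats, written in Ruby with a reduced field list. It assumes Solr's XML add message, a JSON array of documents (accepted by newer Solr versions; older ones expect a different JSON shape) and CSV with a header row. Payload size is printed only as a rough proxy for verbosity; it is not the indexing time measured in the shootout.

```ruby
require 'json'

FIELDS = %w[id trackTitle trackISRC trackDuration].freeze
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Dummy records shaped like the C# generator above, with fewer fields.
def dummy_tracks(count)
  (0...count).map do |i|
    { 'id' => (1000 + i).to_s, 'trackTitle' => "Track #{i}",
      'trackISRC' => (2000 + i).to_s, 'trackDuration' => '3.15' }
  end
end

def xml_escape(text)
  text.gsub(/[&<>"]/, XML_ESCAPES)
end

# Solr XML update message: <add><doc><field name="...">...</field></doc></add>
def to_solr_xml(tracks)
  docs = tracks.map do |track|
    fields = FIELDS.map { |f| %(<field name="#{f}">#{xml_escape(track[f])}</field>) }
    "<doc>#{fields.join}</doc>"
  end
  "<add>#{docs.join}</add>"
end

def to_solr_json(tracks)
  JSON.generate(tracks)
end

# Quote a CSV field only when it contains a comma, quote or line break.
def csv_field(value)
  value.match?(/[",\r\n]/) ? %("#{value.gsub('"', '""')}") : value
end

def to_solr_csv(tracks)
  rows = [FIELDS] + tracks.map { |track| FIELDS.map { |f| track[f] } }
  rows.map { |row| row.map { |v| csv_field(v) }.join(',') }.join("\n") + "\n"
end

tracks = dummy_tracks(50)
{ 'xml' => to_solr_xml(tracks), 'json' => to_solr_json(tracks),
  'csv' => to_solr_csv(tracks) }.each do |name, payload|
  puts "#{name}: #{payload.bytesize} bytes"
end
```

Csv names each field once in the header, Json repeats the field names per record, and Xml repeats them inside opening and closing tags, which is where the difference in verbosity comes from.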

Once serialized into the correct format, we will send it in an update request to Solr. Our metrics will be the QTime (Query time) returned by Solr (how long it took the query to execute on the server), and the round-trip duration – how long it takes from sending the request to receiving the response.
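
The two metrics can be captured together. In this sketch the block passed to the hypothetical measure_update helper stands in for the real HTTP request, and it assumes the response was requested as Json (wt=json) so that QTime can be read from the responseHeader.

```ruby
require 'json'

# Times the block (the request) with a monotonic clock and reads Solr's own
# QTime from the response body the block returns.
def measure_update
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  body = yield
  elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  qtime_ms = JSON.parse(body).fetch('responseHeader').fetch('QTime')
  { round_trip_ms: elapsed_ms, qtime_ms: qtime_ms }
end

# Stand-in for the real request: waits briefly and returns a canned response.
result = measure_update do
  sleep 0.01
  '{"responseHeader":{"status":0,"QTime":7}}'
end
puts "QTime: #{result[:qtime_ms]} ms, round trip: #{result[:round_trip_ms].round(1)} ms"
```

The round trip includes serialization on the wire and network time, so it will always be at least as large as the work Solr reports in QTime.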

Our pistol-slinging winner will be the best performer in the top bracket (2000 records), but we’ll also look at the overall means and medians in search of some quirky statistics.
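
For reference, the mean and median can be computed as below. The sample timings are made-up numbers chosen to show how one slow run pulls the mean away from the median; they are not measurements from this benchmark.

```ruby
def mean(values)
  values.sum.to_f / values.size
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Illustrative timings in milliseconds, including one outlier.
samples = [12, 15, 11, 90, 13]
puts "mean: #{mean(samples)}, median: #{median(samples)}"
```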

Limiting Confounding Variables

Every update request to Solr will be preceded and followed by a request to delete all records – ensuring all updates are performed on an empty box. We’ll also need to ensure every request to Solr – updates and deletes – is given time to be processed (caches reset etc) so there is no overlap of operations. We looked at the Solr book we’ve got and decided 2 second sleeps would be adequate to cover that.
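
The delete, update, delete sequence might be sketched like this. The helper isolated_update and its post and pause parameters are hypothetical; post is any callable that sends a body to Solr's update handler. Commit handling is left out because the post does not describe it.

```ruby
# Solr's XML message for deleting every document in the index.
DELETE_ALL = '<delete><query>*:*</query></delete>'

# Runs one update against an empty index, pausing after every request so
# that operations do not overlap. The default pause is the 2 second sleep.
def isolated_update(post, payload, pause: -> { sleep 2 })
  post.call(DELETE_ALL) # start from an empty index
  pause.call
  result = post.call(payload)
  pause.call
  post.call(DELETE_ALL) # leave the index empty for the next run
  pause.call
  result
end

# Usage with a fake sender and no pause, to show the order of requests.
sent = []
fake_post = lambda do |body|
  sent << body
  'ok'
end
isolated_update(fake_post, '<add></add>', pause: -> {})
puts sent.inspect
```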

We also have multiple cores set up on the Solr box. We don’t know how this would affect uploading in different formats, but it’s an independent variable so we will put it out there. And we definitely made sure no other load was applied to the box.

Benchmark Caveats

Our test data may serialize better or be more easily indexed for any of the formats. We’ll accept the limitations for now and recreate the findings with more diverse, real data one day if the results are encouraging.

Solr also has configuration options that relate to performance. We will only test one possible configuration here and we don’t know what effects they could have on the relative performances.

Go for Your Guns

Xml Shot Down By Csv Kid

With an emphatic thrashing, Xml – the un-cool cousin of Json for Web APIs – has once again been written off due to its verboseness, this time when updating a Solr index... Csv has the fastest trigger-finger in this town, though.

Give me teh codez

Not So Fast Redneck

Two other potential hard-hitters are on the block: remote csv upload and the data import handler. We didn’t look at those here, but we won’t rule them out just yet.