Welcome to Blogs From The Geeks, Intermittent insightful nuggets.


API

Metric Driven Development Fueled by StatsD and Graphite

Posted in API, Agile, Innovation, Metrics, Testing on April 18th, 2012 by goncalopereira – Be the first to comment

Why metrics?

Since I joined 7digital I’ve seen the API grow from a brand new feature, side by side with the (then abundant) websites, to be the main focus of the company. The traffic grew and grew and keeps on growing at an accelerating pace, which brings us new challenges.

We’ve brought the agile perspective into play which has made us adapt faster and make fewer errors but:

  • We can do unit tests but they don’t bring out the behaviour.
  • We can do integration tests but they won’t show the whole flow.
  • We can do smoke tests but they won’t show us realistic usage.
  • We can do load tests but they won’t have realistic weighting.

Even when we write acceptance criteria we are actually being driven by assumptions. Even an experienced developer is really just sampling all their previous work, and as we move to a larger number of servers and applications it’s not humanly possible to take all variables into consideration.

It is common to hear statements like ‘keep an eye on the error log/server log/payments log when releasing this new feature’ but when something breaks it’s all about ‘what was released/when was it released/is it a specific server?’. As the data grows it becomes harder to sample and deduce from it quickly enough to feed back without causing issues, especially when agile tends to implement intermediary solutions whose behaviour may differ from the final solution and has not been studied.

The truth is that nothing replaces real life data and statistics – including developers’ opinions – and if the issue is a black swan then we need to churn out usable information fast!

Taken from @gregyoung

This has been seen before by other companies; for example, Flickr on their Counting and Timing blog post. See also Building Scalable Websites by Flickr’s Cal Henderson.

This advice has been followed by other companies like Etsy on their Measure Anything Measure Everything blog post or Shopify on their StatsD blog post.

How to do it?

Deciding to start with a winning horse, I picked up the tools used by these companies:

StatsD is described as “a network daemon for aggregating statistics (counters and timers), rolling them up, then sending them to graphite”.

Graphite is described as “a highly scalable real-time graphing system. As a user, you write an application that collects numeric time-series data that you are interested i[...]. The data can then be visualized through graphite’s web interfaces.”

The way to implement these is covered in several tutorials. I used StatsD’s own example C# client to poll our own API request log for API users, endpoints used, caching and errors.

In the future it would be ideal for the application to access StatsD itself instead of running a polling daemon.
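As a rough sketch of what emitting metrics straight from the application could look like, here is a small Python example (not the C# client mentioned above). It relies on StatsD's plain-text UDP format, `name:value|type`, with `c` for counters and `ms` for timers. The metric names and the 127.0.0.1:8125 address are assumptions for illustration.

```python
import socket


def format_counter(name, value=1):
    """Build a StatsD counter line, e.g. 'api.requests:1|c'."""
    return f"{name}:{value}|c"


def format_timer(name, milliseconds):
    """Build a StatsD timer line, e.g. 'api.response_time:320|ms'."""
    return f"{name}:{milliseconds}|ms"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget one metric line over UDP.

    Host and port are assumptions (8125 is the usual StatsD default).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()
```

A request handler could then call something like `send_metric(format_counter("api.release.details.requests"))` after each request. Because UDP is connectionless, a missing StatsD daemon should not slow the application down.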

There are a lot of usable features on Graphite. The ones I’ve used so far include Moving Average which will smooth out spikes in the graphs making it easier to see behaviour trends in a short time range and Sort by Maxima.
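To make this concrete, the sketch below builds a Graphite render URL that wraps a metric pattern in `movingAverage` and then `sortByMaxima`. The Graphite host and the metric path are hypothetical; only the two function names and the `/render` endpoint come from Graphite itself.

```python
from urllib.parse import urlencode


def smoothed_and_sorted(metric_pattern, window=10):
    """Wrap a metric pattern in movingAverage, then sortByMaxima."""
    return f"sortByMaxima(movingAverage({metric_pattern},{window}))"


def render_url(base_url, target, time_from="-1hours", fmt="json"):
    """Build a Graphite render API URL for one target expression."""
    query = urlencode({"target": target, "from": time_from, "format": fmt})
    return f"{base_url}/render?{query}"


# Hypothetical metric path and Graphite host, for illustration only.
target = smoothed_and_sorted("stats.api.endpoints.*.requests")
url = render_url("http://graphite.example.com", target)
```

The window of 10 data points is an arbitrary choice; a larger window smooths more but hides short spikes.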

There are even tools to forecast future behaviour and growth using Holt-Winters forecasting. Companies use this to understand future scalability and performance requirements based on data from previous weeks, months or years (seen in this Etsy presentation on Metrics).

How it looks and some findings

Right away I got some usable results. An API client had a bug in their implementation which meant they requested a specific endpoint more often than they actually used it – this data can help out with debugging and also prevent abuse.

Sampled and smoothed usage per endpoint per API user…

Another useful graph is error rates, which might be linked with abuse, deploying new features or other causes.

Error chart, smoothed, with a few spikes, but even those are at the 0.001% rate

Here is some useful caching information per endpoint to know how to tune up TTLs or look for stampede behaviour.

Sampled and smoothed Cache Miss per Endpoint

Opinion

After you start using live data to provide feedback for your work there is no going back. It is my opinion that analysis of short and long term live results of any type of work should be mandatory as we move out of an environment that is small enough to be maintained exclusively by a team’s knowledge.

7digital Tomahawk Resolvers

Posted in API, Community Events, REST, Ruby, community on March 28th, 2012 by goncalopereira – Be the first to comment

Went to Amsterdam Music Hack Day and left with a working Tomahawk Resolver for 7digital previews, a spiked locker integration and some cool ideas on how to promote 7digital with Open Source

This will allow Tomahawk to use 7digital’s track search and listen to previews. In the future it will integrate buy buttons and help Tomahawk work with 7digital :-)

The locker integration is spiked using a local service which would be provided/maintained by either 7digital or a third party so it won’t be available immediately after the demo as there are feature, performance and security concerns.

What was used

http://developer.7digital.net/ 7digital API with the demo musichackday API client key

https://github.com/tomahawk-player Tomahawk resolver examples and documentation

API locker service built in Ruby with Sinatra and the 7digital ruby API client gem. Also uses the JSON gem.

Also built some local stubbed API responses using Ruby Sinatra as the connection was slow/failed sometimes :-)

Links

Don’t judge me on my JS..

https://github.com/goncalopereira/7digital-tomahawk-resolver working search with previews with musichackday key

https://github.com/goncalopereira/7digital-stubs stubs…

Service example code

This will authorise the user with a premium API account and get the locker. It is a slow implementation, as locker search is not available and no caching was added, but it works.

require 'rubygems'
require 'sinatra'
require 'sevendigital'
require 'very_simple_cache.rb'
require 'json'

before do
  @api_client = Sevendigital::Client.new(
    :oauth_consumer_key => 'x',
    :oauth_consumer_secret => 'x',
    :lazy_load? => true,
    :country => 'GB',
    :cache => VerySimpleCache.new,
    :verbose => "verbose"
  )
end

post '/authorise' do
  content_type :json

  email = params["email"]
  password = params["password"]
  search = params["search"]

  user = @api_client.user.authenticate(email, password) # not good, should cache more
  locker = user.get_locker() # also caching missing

  # brute_force_search_on_string_compare is a helper not shown in this post
  valid_results = brute_force_search_on_string_compare(user, locker.locker_releases, search)

  valid_results.to_json
end

Change to Default Artist Image

Posted in API, Announcements on February 20th, 2012 by Nick Skelton – 2 Comments

We will be making a change to the default image we serve when we do not have a proper artist image available.

Currently we redirect requests, via an HTTP 302, to the following image: http://cdn.7static.com/static/img/artistimages/00/000/000/0000000000_200.jpg

This is due to change to: http://cdn.7static.com/static/img/artistimages/00/000/000/_defaultartist_286.png

No other change to behaviour is planned and we do not expect this to affect any API consumers or client applications. However, we advise checking that this won’t cause a breaking change in your application and amending as necessary. We recommend not relying on this URL in any manner, and simply checking for the 302 redirect if you want to ascertain whether a non-default artist image is available.
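A minimal sketch of that check in Python, using only the status code and never the target URL. It assumes the image host answers `HEAD` requests the same way it answers `GET`; the function names are made up for this example.

```python
import http.client
from urllib.parse import urlsplit


def is_default_artist_image(status_code):
    """A 302 means the request was redirected to the default image."""
    return status_code == 302


def fetch_status(image_url):
    """Return the HTTP status for image_url without following redirects.

    http.client never follows redirects on its own, so a 302 is
    reported as-is. Using HEAD here is an assumption.
    """
    parts = urlsplit(image_url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()
```

An application would call `is_default_artist_image(fetch_status(url))` with its own artist image URL and hide or replace the image when the result is true.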

Planned change date is Monday 27th February.

Getting started with web applications on Mono

Posted in API, Development, How To, OpenRasta on February 17th, 2012 by Hibri Marzook – Be the first to comment

 

I’ve started to explore Mono, with a view to moving some of our web applications to Linux. I used MonoDevelop on OSX to spike a simple HttpHandler to return a response. I was more interested in how the hosting and deployment story worked with Mono.

This is a little list of things I discovered as I went along.

http://www.mono-project.com/ASP.NET has a list of the hosting options available. I went with the Nginx option. Mono comes with xsp, which is useful for local testing.

Running a simple web application

To run xsp, use /usr/bin/xsp --port 9090 --root <path to your application>, and the application will be available on http://localhost:9090

 

To install Nginx on OSX, get Homebrew. And then simply sudo brew install nginx

Follow the instructions here http://www.mono-project.com/FastCGI_Nginx to configure Nginx to work with Mono’s FastCGI server.

On OSX, the Nginx config can be found in /usr/local/etc/nginx/nginx.conf

This is the configuration I tried for my testing,

In /usr/local/etc/nginx/nginx.conf

server{
   listen 80;
   server_name localhost;
   access_log /var/log/nginx/localhost_mono_access.log;
   location / {
        root /Users/hibri/Projects/WebApp/;
        index default.aspx index.html;
        fastcgi_index default.aspx;
        fastcgi_pass 127.0.0.1:9000;
        include /usr/local/etc/nginx/fastcgi_params;
   }
}

Add the following lines to /usr/local/etc/nginx/fastcgi_params

 fastcgi_param  PATH_INFO          "";
 fastcgi_param  SCRIPT_FILENAME    $document_root$fastcgi_script_name;

 

Start Nginx.

Start the Mono FastCGI server

fastcgi-mono-server2 /applications=localhost:/:/Users/hibri/Projects/WebApp/ /socket=tcp:127.0.0.1:9000

And the application is available on http://localhost

Web Frameworks

We use OpenRasta for the services I want to run on Linux. OR didn’t work out of the box. This is something I’ll be exploring in the next few days.

Tried ServiceStack too, and was able to get one of our projects (https://github.com/gregsochanik/basic-servicestack-catalogue) working on Mono as is. Nancy is next on the list.

Search

Posted in API, Search, Solr, SolrNet on October 13th, 2011 by Mark Unsworth – 2 Comments

We will be the first to admit that our search has been far from optimal for some time, it’s something that’s frustrated us as much as it has our users. Unfortunately the unprecedented growth of 7digital has taken its toll on the original search infrastructure that powered our platform for the last 7 years – that’s right we were 7 this year.

A few weeks back we quietly made some changes to the artist and release search. These changes have been in the works for several months and have improved the quality of our search results as well as the speed with which those results are returned. Alongside the improvements to quality and speed we are also now returning accurate pricing for releases across all of our catalogue, something that we haven’t been able to do previously.

Architecture

The main reason for the improvements has been our move away from using SQL Server Full Text search to using the open source Solr search platform.  Solr is a super fast open source search server built on top of the Lucene search library. We’re also using SolrNet to be able to index and query Solr from our .NET codebase – more on our SolrNet usage here.

We have a master-slave set up where we index all of our documents (~40m) to a single write-only master. This is then set to replicate out to a number of read-only slaves. We aren’t currently sharding the data across the slaves so they are exact mirrors with HAProxy in front of them to balance the load.

Load Balancing

We originally went with a round-robin approach to load distribution but realised that we were potentially caching the same query on each of the slaves so used the balance url_param feature of HAProxy. This means that the same query is always requested from the same slave. Average query times were reduced by 50% from this change alone. The graph below shows the avg response time dropping off and stabilising once the change had been made.
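The effect can be illustrated with a toy simulation in Python. This is not HAProxy's actual hashing algorithm, and the server names and queries are invented. It only shows why sending the same query to the same slave leads to fewer cold-cache lookups than round-robin.

```python
import hashlib
from itertools import cycle

SLAVES = ["solr-1", "solr-2", "solr-3"]  # hypothetical server names


def pick_by_query(query, servers=SLAVES):
    """Map a query string to a server via a stable hash of its value."""
    digest = hashlib.md5(query.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def count_cache_misses(queries, route):
    """Count first-time (server, query) pairs, i.e. cold-cache lookups."""
    seen = set()
    misses = 0
    for q in queries:
        key = (route(q), q)
        if key not in seen:
            seen.add(key)
            misses += 1
    return misses


queries = ["lady antebellum", "adele"] * 6
round_robin = cycle(SLAVES)
rr_misses = count_cache_misses(queries, lambda q: next(round_robin))
hashed_misses = count_cache_misses(queries, pick_by_query)
```

With two distinct queries and three slaves, round-robin ends up warming every slave for every query (six misses), while the hashed routing misses only once per query.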

Improving Quality

We haven’t had to do a great deal to improve the relevance of the results returned as Solr gives you this for free, but we have been investing time in looking at the ways our users are searching and seeing what we’re missing from our index. Better logging of search requests should allow us to be able to understand more about where customers are not finding what they are looking for. We’ll blog more about this when we start work on it.

Speed Improvements

Our average search response times are now less than 200ms. This is a significant improvement on the days of SQL Server Full Text search, when the average query time via our API was around 2 seconds, and also on the initial implementation of search on top of Solr, which had average query times of around 500ms.

The image below, taken from our New Relic dashboard for our Search API, shows the last month’s stats for the Search API. The left hand chart shows the average response time (lower is better), the top right shows the Apdex (performance) score (higher is better) and the bottom right shows the number of requests per minute we are seeing.

API Search Traffic 4/9 - 4/10

To put this into perspective, if you search for ‘Lady Antebellum’ on Google it takes around 200 milliseconds, but through our API it only takes 58ms  - ok so Google do return a result set of 54 million pages but they don’t show our artist page at the top!

Future Plans

We will be making more improvements to search over the coming weeks and months including a long awaited update to the track based search.

My First Fortnight – Switching Agile Teams

Posted in API, Agile, Development on September 14th, 2011 by Paul Shannon – Be the first to comment

I’ve always encouraged new starters in my previous team to write a post summarising their first impressions, so after starting at 7Digital 2 weeks ago I thought I’d do the same. While the aforementioned new starters are usually fresh faced graduates, I’m more a lived-in, agile, worked-at-the-coal-face, TDD obsessed, software craftsman who played a major part in bringing agile principles and practices to my former team. I’ve joined the services team, working on the 7Digital API, the Asset Server and Accounts web app.

First Impressions

My first impressions of the new team, working environment and codebase are positive, with similarities to how I am used to working. One major difference I noticed was the number of tests – I’m used to having the odd regression level test backed up by mainly unit tests with fewer integration tests in favour of collaboration tests using mock objects. The approach here suits the products well and is fantastic for a new starter, as you can safely refactor or transform most of the code with confidence.

Outside In

In my first two weeks the majority of tasks I’ve been working on are around configurations, and higher level tests (acceptance and smoke tests). This has simply been down to the nature of work at the moment but has served me well in orienting myself with the systems. I was paired on  a task to create a diagram of dependencies for services in our team. This helped me understand the detail of the SOA environment that the team works with. It also provided a useful resource for the rest of the team so I was pleased that I was able to add value early on. Working on the acceptance and smoke tests was useful because of the old adage of “tests as documentation”. The behaviour style naming and separation of actions using cucumber syntax meant that I could understand the tests, and the API it was testing, quickly without too many questions. The simple language meant I could see what the system is supposed to do before I am fully immersed in the team’s domain.

Code Quality – Team Goals

I’ve seen a mixture of code quality because of the variety in levels of legacy code the company has (or “legacy, legacy code” as Rob puts it) but I’m pleased that there is a whole team mentality to improve the quality of any code we touch. It was a pleasant surprise when my suggestion of encapsulating a primitive Dictionary as a class with a single responsibility and internal state was met with enthusiasm rather than resistance. I hope my experience and attitude for improving code quality can help the team meet their goals, as I think I’ll fit in well here.

Friday Deployments

There is a strict rule here that deployments on a Friday are only for exceptional circumstances. I like this policy; I had it in my last job, where it was colloquially known as Shannon’s Second Law (Shannon’s First Law being attributed to Claude Shannon and describing the maximum transmission bandwidth for a channel with noise). I was surprised to see an emergency deployment on my first Friday though, and thought I should record it here for posterity ;)

Sustainable Pace

The pace here is a little slower than I’m used to, yet we seem to be adding a great deal more value. It makes the working environment relaxed, which helps to maintain focus when you need to. The frequency of commits to the master branch, continuous integration runs and high quality code mean that this practice really works well here. I’m just trying not to be too eager and to adjust to the natural flow.

Friendly Folk

There is a very social atmosphere amongst the teams here, from sharing lunch around the sofas to getting a beer in on a Friday. Even the odd flame war is done in good humour. It makes a particularly daunting time far easier for new starters.

I’ll probably add some posts in the future on my experiences here, along with some more technically insightful nuggets – for now though you could always follow me @BlueReZZ

Improving API customer support

Posted in API on July 29th, 2011 by wack – Be the first to comment

As software developers, our role is to provide solutions to our customers’ needs. Sometimes it is by writing new code, new features and innovative products. But often it is also in the role of developer support: answering customer queries, triaging bug reports etc. We spend a significant amount of our time every day in the latter of these roles, and are constantly looking for ways to improve our work in this regard (as we do with all of our work).

So, in an effort to improve the speed and quality of the support we supply to our customers, we have developed a support template, supplying the information that we feel is necessary to effectively handle API support queries.

Ideally, we now require:

  • The full raw HTTP request/response logs (in sequence) including:
    • Request URL
    • Request headers
    • Request body
    • Response headers
    • Response body
  • Screenshots if available (optional)
  • Any steps needed to reproduce the bug (especially for devices)

Any mature application framework (ASP.NET, J2EE, Rails etc) should be able to provide this information. We also use an open-source tool called cURL, available pre-installed on Linux and OSX and for Windows via download (http://curl.haxx.se/) which can provide this information via the command line.

To use cURL (from a command prompt):

curl -v "http://api.7digital.com/1.2/release/details?releaseid=155408&oauth_consumer_key=YOUR_KEY_HERE" -H "accept:application/xml" -X GET -o output.txt

…where -o is the output file, -H supplies your HTTP headers, -X is your HTTP verb and -d supplies data (for POST requests).

Sample cURL Output:

Request:

* About to connect() to api.7digital.com port 80 (#0)
*   Trying 94.127.74.186... connected
* Connected to api.7digital.com (94.127.74.186) port 80 (#0)
> GET /1.2/release/details?releaseid=155408&oauth_consumer_key=YOUR_KEY_HERE HTTP/1.1
> User-Agent: curl/7.21.1 (i686-pc-mingw32) libcurl/7.21.1 OpenSSL/0.9.8k zlib/1.2.3
> Host: api.7digital.com
> accept:application/xml
>

Response:

< HTTP/1.1 200 OK
< Server: nginx/0.7.67
< Date: Fri, 29 Jul 2011 09:31:07 GMT
< Content-Type: text/xml; charset=utf-8
< Last-Modified: Fri, 29 Jul 2011 09:30:29 GMT
< X-AspNet-Version: 2.0.50727
< X-RateLimit-Current: 68
< X-RateLimit-Limit: 4000
< X-RateLimit-Reset: 52132
< Set-Cookie: SevenDigital.Web.Session=sid2=3fc9d66d-38a6-40dc-8ff2-f159355ba49c; domain=.7digital.com; path=/; HttpOnly
< x-7dig: aw0
< Accept-Ranges: bytes
< Cache-Control: private, max-age=120
< Age: 0
< Expires: Fri, 29 Jul 2011 09:33:07 GMT
< x-cdn: Served by WebAcceleration
< Transfer-Encoding: chunked
< Connection: Keep-Alive
<
<?xml version="1.0" encoding="utf-8" ?><response status="ok" version="1.2"><release id="155408"><title>Dreams</title><version>UK</version><type>Album</type><barcode>00602517512078</barcode><year>2007</year><explicitContent>false</explicitContent><
... body omitted for brevity...
</response>
* Closing connection #0
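If the request is made from application code rather than cURL, the same details can be gathered in the order the template asks for. The helper below is a hypothetical sketch in Python. It only formats values you have already captured from your own HTTP client, and the sample values are shortened from the cURL example above.

```python
def format_exchange(method, url, request_headers, request_body,
                    status_line, response_headers, response_body):
    """Lay out one HTTP exchange in the order the support template lists."""
    lines = [f"Request URL: {method} {url}", "Request headers:"]
    lines += [f"  {k}: {v}" for k, v in request_headers.items()]
    lines += ["Request body:", f"  {request_body or '(empty)'}"]
    lines += [f"Response: {status_line}", "Response headers:"]
    lines += [f"  {k}: {v}" for k, v in response_headers.items()]
    lines += ["Response body:", f"  {response_body}"]
    return "\n".join(lines)


report = format_exchange(
    "GET",
    "http://api.7digital.com/1.2/release/details"
    "?releaseid=155408&oauth_consumer_key=YOUR_KEY_HERE",
    {"accept": "application/xml"},
    "",
    "HTTP/1.1 200 OK",
    {"Content-Type": "text/xml; charset=utf-8"},
    '<response status="ok" version="1.2">...</response>',
)
```

Printing `report` gives a block of text that can be pasted into a support request alongside any screenshots and reproduction steps.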

For OAuth requests, we also provide a WinForms application here, that will allow you to construct your own signed API requests.  Currently in active development, it is something we use every day here at 7digital, and will be constantly improved over time.

Planned Maintenance – Monday 27/6/2011

Posted in API, Announcements on June 23rd, 2011 by Hibri Marzook – Be the first to comment

We will be performing an essential upgrade of our DB infrastructure on Monday June 27th, starting at 4pm GMT and lasting a few minutes. This will affect the whole 7digital Platform and our services might be unavailable during this period. Whilst the upgrade is taking place, holding pages will be served by our consumer websites and the 7digital API will be returning the following error response:

<response status="error" version="1.2"> 
  <error code="7001"> 
    <errorMessage>7digital API is currently down for maintenance</errorMessage> 
  </error> 
</response>
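A client could detect this response and show its own holding message instead of failing. The Python sketch below parses the error XML shown above; the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

MAINTENANCE_CODE = "7001"


def is_maintenance_response(xml_text):
    """True when the API reports error code 7001 (down for maintenance)."""
    root = ET.fromstring(xml_text)
    if root.get("status") != "error":
        return False
    error = root.find("error")
    return error is not None and error.get("code") == MAINTENANCE_CODE


sample = """<response status="error" version="1.2">
  <error code="7001">
    <errorMessage>7digital API is currently down for maintenance</errorMessage>
  </error>
</response>"""
```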

Apologies for any inconvenience caused.

Best practices for working with the 7digital API

Posted in API, FAQ, How To on May 19th, 2011 by Hibri Marzook – Be the first to comment

Use of the 7digital API has grown, and we are seeing a lot of innovative applications and websites using the API. We are constantly adding new things to the API, and doing a lot of work to improve it. This means that the API will continue to change, and if you don’t follow the guidelines below it is likely that your applications consuming the 7digital API may break.

We’ve seen a few API implementations that don’t use the API in an optimal and tolerant manner. This article is to give some guidance to build a resilient application around the API.

The motivation for this post also comes from a recent article on consuming web services http://martinfowler.com/bliki/TolerantReader.html.

 

Reading responses.

When reading the response,  access elements by path and name, and use xpath if possible. Access elements by their hierarchy in the xml. Sometimes elements with the same name can appear in the same response under different parent elements. These can have different values depending on the parent element.

In your code that parses the xml, avoid accessing elements by index, for example elements[0]. Instead access elements by name, e.g. elements["name"]. If the order of xml elements changes, your code will still work. The order of elements can and will change at any time. Please do not depend on elements being in a certain order in the API response.
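As an illustration in Python, here is the difference between the two approaches on a shortened copy of a release/details response. The snippet uses the standard library's ElementTree; other XML libraries offer similar path-based lookups.

```python
import xml.etree.ElementTree as ET

# Shortened copy of a release/details response; only a few elements kept.
xml_text = (
    '<response status="ok" version="1.2">'
    '<release id="155408"><title>Dreams</title><version>UK</version>'
    '<type>Album</type><year>2007</year></release>'
    '</response>'
)

root = ET.fromstring(xml_text)

# Fragile: depends on <title> being the first child of <release>.
title_by_index = root[0][0].text

# Tolerant: addresses the element by its path and name.
title_by_path = root.findtext("release/title")
release_type = root.findtext("release/type")
```

Both lookups return the same title today, but only the path-based one keeps working if the elements are reordered.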

 

The Schema.

We’ll make every effort not to break the schema, but you shouldn’t rely on this. Read the elements you need from the response. It’s best to write code that has a graceful fall back if a certain element is not populated or is missing. A missing element should not crash your application/website. The schema is not a binding contract, but a document which describes the structure of the response. In programming terms, treating the API response as a dynamic type, and not as a static type is recommended.
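A small Python sketch of such a graceful fall back. It reads only the fields it needs and supplies defaults when an element is absent. The `<label>` element is hypothetical and is there only to show the missing-element case.

```python
import xml.etree.ElementTree as ET


def read_release(xml_text):
    """Pick out only the fields we need, tolerating missing elements."""
    root = ET.fromstring(xml_text)
    release = root.find("release")
    if release is None:
        return None
    return {
        "id": release.get("id"),
        "title": release.findtext("title", default="(untitled)"),
        # <label> is a hypothetical element, used to show the fallback.
        "label": release.findtext("label", default=None),
    }


info = read_release(
    '<response status="ok" version="1.2">'
    '<release id="155408"><title>Dreams</title></release></response>'
)
```

The caller gets a plain dictionary and can decide how to render a missing value, instead of the parser raising an exception halfway through a page.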

If you come across unexpected responses or discrepancies between documentation and the actual API behaviour don’t just work around these. Please get in touch and let us know about the problem and we’ll do our best to fix the issue or advise you on the right way to go.

OAuth.

The OAuth libraries at http://code.google.com/p/oauth/ are recommended. These are used in our code and there is a reference implementation in .Net with the OAuthconsole https://github.com/7digital/OAuthConsole

You should not need to do the 3-legged OAuth process each time you access an endpoint that requires 3-legged OAuth. 3-legged OAuth sets up a trust relationship between you (the consumer) and a 7digital user, who authorises you to make requests on their behalf. This should be set up only once. The access token and secret obtained during this process can be re-used until the 7digital user revokes access.
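One way to picture this in Python: keep the token and secret after the first authorisation and load them on later runs. Everything here is a sketch under assumptions. `authorise` stands in for whatever runs your 3-legged flow, and a plain JSON file is used only for brevity; a real application should store the secret somewhere protected.

```python
import json
import os


def load_or_authorise(token_path, authorise):
    """Return a saved access token and secret, authorising only if needed.

    `authorise` is a hypothetical callable that runs the 3-legged flow
    once and returns a (token, secret) pair.
    """
    if os.path.exists(token_path):
        with open(token_path) as f:
            saved = json.load(f)
        return saved["token"], saved["secret"]
    token, secret = authorise()
    with open(token_path, "w") as f:
        json.dump({"token": token, "secret": secret}, f)
    return token, secret
```

If a later request fails because the user has revoked access, the application would delete the saved file and run the authorisation again.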

Local caching.

We return cache headers. Make use of them. Your application will be more responsive if you don’t make unnecessary server calls.

Placing a proxy server such as Squid between your application and the API, will give you caching quite easily and send requests to the server only when the cached item has expired.
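If a proxy is not an option, a small in-process cache that honours `max-age` gives a similar effect. The Python sketch below is a simplification: it looks only at the `max-age` directive of `Cache-Control` and ignores everything else, and the class and function names are invented for this example.

```python
import time


def parse_max_age(cache_control):
    """Extract max-age (in seconds) from a Cache-Control header, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


class ResponseCache:
    """Tiny in-memory cache that keeps a body until max-age has passed."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}

    def store(self, url, body, cache_control):
        max_age = parse_max_age(cache_control)
        if max_age:
            self._entries[url] = (self._clock() + max_age, body)

    def get(self, url):
        expires_at, body = self._entries.get(url, (0, None))
        return body if self._clock() < expires_at else None
```

The application would call `get(url)` before making a request and `store(url, body, header)` after a successful one, so repeated lookups within the `max-age` window never reach the API.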

 

Ask us

We are on twitter @7digitalapi and on google groups at 7digital API Developers. We are here to help you with your implementation. We’ve got sample code on the  7digital github page to help you get started.

7digital vouchers now redeemable via the API – Important Basket API changes

Posted in API, Announcements on May 9th, 2011 by filip – Be the first to comment

We’re happy to announce that 7digital API now supports redemption of 7digital track and album vouchers.

To better accommodate for voucher discounts applied to baskets a couple of important changes have been made to the basket XML schema: 

In all Basket API responses we have introduced a new XML element <amountDue> which will denote the final price to be paid by the end user for a given item or the whole basket respectively. The value of <amountDue> will reflect any discounts applied (at this point vouchers only) to the item or basket. The existing <price> element will remain part of the basket XML response for backward compatibility but will denote the standard catalogue price, i.e. its value will not change regardless of any discount/voucher applied.

Please also note that in order to keep down the size of the response we’ve omitted the currency information from the <amountDue> element on basket item level. The currency information will still be available in the <amountDue> element on basket level. All items within a basket will always have their prices in the same currency.

What does this mean to you?

  1. if you have an existing integration with 7digital Basket API and don’t plan to support 7digital vouchers you won’t be affected by these changes. But at your earliest convenience we encourage you to start using <amountDue> as the price displayed to the end users.
  2. if you have an existing integration with 7digital Basket API and would like to start accepting 7digital vouchers you will need to update your application to use <amountDue> as the price displayed to end users. If you wish to, you can use the <price> element to display the original price (if it differs from the amount due) as long as it’s clearly marked as the price before discount, e.g. by striking it through.
  3. if you’re building a new application using the 7digital Basket API you should use the new <amountDue> element to display prices to end users regardless of whether you plan to accept 7digital vouchers or not.
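The display rule in the points above can be sketched in Python as follows. The `<basketItem>` wrapper and the flat text values are assumptions made for this example; the real basket schema is described in the Basket API documentation and is likely to nest more detail inside these elements.

```python
import xml.etree.ElementTree as ET


def price_to_display(item):
    """Prefer <amountDue>; fall back to <price> if it is absent."""
    amount_due = item.findtext("amountDue")
    return amount_due if amount_due is not None else item.findtext("price")


def original_price_if_discounted(item):
    """Return <price> only when it differs from the amount due."""
    price = item.findtext("price")
    return price if price != price_to_display(item) else None


# Hypothetical, simplified item: the real schema nests more detail.
item = ET.fromstring(
    "<basketItem><price>7.99</price><amountDue>5.99</amountDue></basketItem>"
)
```

When `original_price_if_discounted` returns a value, it can be shown struck through next to the amount due, as described in point 2.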

For further details and example XML responses please see the updated documentation for Basket API and the new basket/applyVoucher method.

If you have any questions please don’t hesitate to get in touch.

Thanks
The 7digital API Team