Over the last month we’ve started using ServiceStack for a couple of our API endpoints (see the full ServiceStack story here). We’re hosting these projects on a Debian Squeeze VM using nginx and Mono.
We ran into various problems along the way, which we’ll explain, but we also managed to achieve some interesting things; here’s a summary. Hopefully you’ll find it useful.
We’re using nginx and FastCGI to host the application. This is good from a systems perspective because our applications can run without root privileges.
For the communication between mono-fastcgi and nginx we use a Unix socket file instead of proxying through a local TCP port. This makes configuration much easier: you map applications to files rather than port numbers, so the naming conventions are much more straightforward. (Besides, you may be hit by a memory leak if you don’t use Unix socket files.)
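For reference, this is roughly how the Mono FastCGI host gets bound to a socket file (a minimal sketch; the application root and socket path are illustrative):

```bash
# Start the Mono FastCGI server on a Unix socket file instead of a TCP port.
# /applications maps the virtual path "/" to the app's physical directory.
fastcgi-mono-server4 /applications=/:/var/www/myapp \
                     /socket=unix:/tmp/SOCK-myFqdn
```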
Furthermore, using files instead of ports has made our life easier for automated deployments because:
- We can specify the socket file based on the incoming Host header, so you only need one app configuration in nginx for N sites. The key here is to use the $http_host variable in the fastcgi_pass setting of the nginx config file. Example:
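(A minimal sketch of that server block; the listen port and params are illustrative, and the socket naming follows the /tmp/SOCK-<fqdn> convention used in the deployment steps below.)

```nginx
server {
    listen 80;

    location / {
        include        fastcgi_params;
        fastcgi_param  PATH_INFO "";
        # one app configuration for N sites: the socket file is derived
        # from the incoming Host header
        fastcgi_pass   unix:/tmp/SOCK-$http_host;
    }
}
```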
- We can decouple deployment from release, making the latter an instant operation that doesn’t require a restart of the app or the web server. The deployment procedure is therefore as follows (there’s a shell sketch of the whole sequence after the list):
1. Copy the new release to the servers.
2. Start the app server for the new release, listening on a different Unix socket file than the current working release, e.g. /tmp/SOCK-myFqdn-newrelease. (We use nohup for this.)
3. Do some test requests against the new release to verify that it works, using a custom Host header that matches the Unix socket file pattern used in step 2.
4. Make a hard link (not a symlink) of the current release’s socket file to have a backup.
5. Move the Unix socket file of the new release over the current release’s: “mv /tmp/SOCK-myFqdn-newrelease /tmp/SOCK-myFqdn”.
6. Kill the old release.
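Put together, the whole thing looks roughly like this on a single server (a sketch under the conventions above; the directory layout, status URL and sleep are illustrative):

```bash
#!/bin/sh
# Illustrative sketch of steps 1-6 for one host; "myFqdn" stands for the site's FQDN.
FQDN=myFqdn

# (1) assume the new release has already been copied to /var/www/$FQDN-newrelease

# Record the PID of the currently running release now, before a second
# matching process exists (this is the pgrep mentioned further down).
OLD_PID=$(pgrep -f "/tmp/SOCK-$FQDN")

# (2) start the new release on its own socket file
nohup fastcgi-mono-server4 /applications=/:/var/www/$FQDN-newrelease \
      /socket=unix:/tmp/SOCK-$FQDN-newrelease &

# (3) smoke-test it through nginx with a Host header matching the new socket
curl -H "Host: $FQDN-newrelease" http://localhost/status

# (4) hard link the current socket file as a backup
ln /tmp/SOCK-$FQDN /tmp/SOCK-$FQDN-backup

# (5) atomically swap the new release in
mv /tmp/SOCK-$FQDN-newrelease /tmp/SOCK-$FQDN

# (6) give in-flight requests a moment to finish, then kill the old release
sleep 10
kill $OLD_PID
```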
The key step here is (5). By moving the new release’s socket file over the old one, without moving or removing the old socket file first, we manage to decouple the deployment operation from the release, with an “atomic” mechanism that gives us zero downtime (we tested this with siege, a very interesting load-testing tool, configured with 100 concurrent connections) and no need to adjust load balancers as part of the deployment.
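For reference, a siege run along these lines keeps constant traffic on the site while the socket file is swapped (the URL and duration are illustrative):

```bash
# 100 concurrent simulated users for one minute against the site being redeployed
siege -c 100 -t 1M http://myFqdn/
```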
We’re still studying the best way to do step (6). For now we just wait briefly to be sure the app has finished serving the last requests. The problem with this approach is that we need to record the PID of the app to be able to kill it later: lsof hasn’t worked in our scenario after much testing, either after step 5 (using the hard link) or before step 4 (using the normal file), so for now we use pgrep after step 1. We’re considering stopping the application via a final HTTP request (with the DELETE verb on the status endpoint, maybe? sounds about right :) ), but calling Environment.Exit() from a web application doesn’t look right.
So, to be able to deploy to many servers at once, we use Capistrano to launch the Ruby scripts that drive steps 1-6 above. Another interesting thing we do with Capistrano is configuration.
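As a rough idea of the shape this takes (a Capistrano 2-style sketch; the task name, script path and variable are illustrative, not our actual code):

```ruby
# config/deploy.rb: run the release script (steps 1-6) on every app server
set :fqdn, "myFqdn"

namespace :deploy do
  task :swap_release, :roles => :app do
    run "ruby /opt/deploy/swap_release.rb #{fqdn}"
  end
end
```

Running “cap deploy:swap_release” then executes that command in parallel on every server in the :app role.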
Many .NET developer teams are in the habit of managing configuration files at the app level, naming them with extensions that map to the environment. This is pretty handy because as a developer you have all the configuration values of all the environments right there in your IDE, in case you want to explore something.
But it has a big drawback: any configuration changes need to be done in the same repository as the code, which has the following consequences:
- Configuration changes end up in the long chain of continuous integration; you have to wait for the build and test infrastructure before your change goes through and can be deployed.
- Configuration ends up being mainly a developer task. Sure, DevOps practices encourage involving developers more in system administration and configuration, but in this case it is a drawback because it leaves the systems team unable to make configuration changes: they are not normally familiar with (and aren’t supposed to know) the src/ sub-folder structure where the config files live, and they don’t typically use the CI tool to check whether their configuration change has broken some test suite.
- Sometimes configuration files end up being almost exactly the same between environments, only differentiated by one or two values. This makes locating errors in the config files a daunting task.
So we decided to implement configuration at the deployment level: there is a base configuration file in the sources which contains default values, and in a different repository there is, for each environment, a Ruby configuration file that holds only the values that need to change, e.g. production.rb, uat.rb, etc. This file simply contains variables whose names are the appSetting keys whose values need to be replaced before deployment, or a hash map in which each key is an XPath expression that locates the proper node in the XML config file. Example:
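(A sketch of what such a file could look like; the keys, values and XPath expressions are illustrative, not our real settings.)

```ruby
# production.rb: only the values that differ from the defaults in the sources

# Plain variables: the variable name is the appSetting key, the value replaces
# the default before deployment.
serviceUrl  = "https://api.example.com"
minPoolSize = "5"

# Hash map: each key is an XPath expression locating the node to overwrite
# in the XML config file.
xpathOverrides = {
  "//system.web/customErrors/@mode" => "On",
  "//connectionStrings/add[@name='Main']/@connectionString" =>
    "Server=prod-db;Database=orders;Min Pool Size=5"
}
```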
This way we decouple implementation from configuration completely. If there are red builds all over your CI environment, you don’t need to do fancy things with stable branches just to make a configuration change in production. The systems team can scale the application by adding new servers without the developers even knowing about it…
We’re using Mono 2.11.3, which contains various bug fixes compared to the version that comes with Squeeze; notably, it fixes problems with the min pool size specified in a connection string. Rule of thumb: if there’s a bug in Mono, try the latest version!
For the build agents we use an even newer version, 2.11.5 (which hasn’t been tagged yet; to get it you need to clone the master branch), to be able to use xbuild with F#. There were some issues that we fixed to make this work, both in the Mono repository and in the fsharp repository. (We found Mono to be faster than .NET at compiling builds and running tests!)
NB: Our systems team has helped a lot in ironing out the kinks of all this, kudos to them!