Another Developer Blog

REST in Practice Tutorial

I was fortunate enough to attend a REST in Practice tutorial run by Jim Webber yesterday. The material was taken from the recently published book of the same name, which Jim co-authored with Savas Parastatidis and Ian Robinson.

As you might expect, the tutorial followed the contents of the book closely and used the Richardson model as its outline. The tone for the tutorial was set with a discussion about the state of most enterprise architectures today, covering the Richardson level 0 stuff. The conclusion was that most enterprise architectures go out of their way to hide the network from us, which has resulted in some very complex solutions to inherently difficult problems (scalability, distributed computing, error handling etc.). The result is even more complexity.

It turns out that Tim Berners-Lee solved a lot of these problems over 20 years ago when he designed the foundations of the web. Jim went on to explain the web as a platform, introducing resources, HTTP verbs and hypermedia/HATEOAS as we worked through Richardson levels 1-3. Along the way we covered content types, response codes, scalability, caching, Atom and security.

I haven’t read REST in Practice yet (it’s top of my reading list as I write this), but I got a lot out of the tutorial: plenty of practical advice which I can take away and use immediately. Jim was as entertaining as always, and a great presenter (although a little disappointed at times when checking the cricket scores).

Jim’s running the tutorial with Ian as part of YOW!2010 this week in Australia, and there have been other dates globally. I’d highly recommend this tutorial for anyone working on distributed systems, in the enterprise or otherwise.

Pragmatic Pair Programming

The reasons for pair programming are well documented, and while I generally believe it to be a good way to develop software, I don’t believe in a one-size-fits-all approach to anything.

One of the main arguments for pair programming is knowledge sharing. Frequently rotating people across different stories gives them broader exposure to the system than they may otherwise receive. This benefits the team as everyone becomes familiar with all aspects of the system and anyone can support any part of the system as needed. That’s the theory, anyway.

I’ve been thinking recently that we can achieve the same result provided we are able to consistently produce small stories or tasks which take no more than a couple of days to complete. This gives us the benefits of rotation, as individual developers are able to move on to new stories at a quick and consistent pace. Small, consistently sized stories also have the benefit of helping the team to achieve flow. By that, I mean features can be pulled through the development, testing and sign off process at a steady pace, and bottlenecks in this system are avoided.

Another argument for pair programming I’ve been considering is that it produces better design and makes tough problems easier to solve. After all, two minds are better than one, aren’t they?

While many developers may be able to solve the vast majority of problems they face, as they become more experienced they learn to recognise good solutions and bad ones, and everything in between. I think a mature developer should be disciplined enough to recognise when they don’t know the best solution to a particular problem without the need for pairing.

A less mature developer may go ahead and implement the bad solution, and get the job done albeit in a way that is likely to have longer term negative consequences for the project. A good developer will stop to informally run through their thought process with a colleague, or gather everyone around a whiteboard to discuss options. This again helps with knowledge sharing within the team.

While I think pairing is beneficial, I take a pragmatic approach to doing it. Sometimes a team can be more productive without pairing, and at other times pairing can make a team more productive. I don’t believe that pairing needs to be used by all members of a team all the time. If a couple of people decide to pair, that doesn’t mean everyone else has to. Making rules like this can actually have a negative effect on the team.

It’s not that some developers are above pairing, while others need supervision. It’s about being able to identify the best approach to the task in hand, and being able to adapt as necessary.

Anti Patterns of Enterprise Application Architecture: The Database Application

Most large enterprises enforce a gated deployment process to take an application from development to production. That’s to say that an application must pass through a number of different environments before it gets signed off to enter the production environment.

For example, an application developed on a desktop machine must first be deployed to a QA environment, then a UAT environment and maybe a Performance Testing environment before finally being given the OK to move to production, for the whole world to see.

This process usually requires sign off from at least one manager, probably more. Couple this with the fact that corporate IT departments are often divided into areas of functional responsibility (think database, application server, security etc.), and moving the various components of an application through each environment requires the co-ordination of several different teams. The result is that taking an application to production is not a quick process, and usually ends up taking several days, if not weeks.

Business sponsors of IT projects end up feeling frustrated by how long it takes to make even minor changes to the applications that they pay for, and developers are left feeling like they’re letting their sponsors down.

The path of least resistance to deploying anything to the production environment often emerges as the database. After all, that’s only a single layer of your application, right? So, the process of moving database changes to production avoids the need to co-ordinate several different teams and deal with complications arising from error-prone application server deployments, and other potential points of failure.

What people often overlook is that the more configuration you push into the database, the more the application you’re developing starts to resemble an interpreter. It becomes possible to configure virtually every feature your application has to offer from the database. At this point your application starts becoming the equivalent of the JVM or .NET run-time, and the database is really just a glorified source code repository playing host to the code that your application interprets at run-time.

Take it a step further, and why not move some of the logic behind your application into the database? It’s amazing what you can do with a few thousand lines of stored procedure code!

But have you ever tried to debug several thousand lines of stored procedure? I have, and I blame my thinning hairline on the experience.

And testing? Well, that just seems to get slowly forgotten. Given that we already know it’s so easy to make changes to the production database, so what if things go wrong in production? It’s easy enough to correct mistakes by pushing out another untested database change straight to production. A few hours (or maybe days) of downtime isn’t that bad, is it?

We’re all in agreement that separating application logic from configuration is a good thing that allows us to re-use code we’ve written, but let’s not take things too far. Of course, the ideal solution would be to change the process so that moving an application to production is quicker, but we’re rarely in a position to do this. Instead of trying to work around the corporate red tape that makes it difficult to deploy changes to production, the best we can do in this situation is to oil the gears that need to turn to get things done.

That faceless application server support guy you always email to get your application deployed to QA? Why not take him out to lunch and get to know him? The security guy you always email to get ports opened for your application to function correctly? How was his weekend? And how are his kids going at school? Pick up the phone and find out. You’ll be amazed how much more quickly you can get your application moved through different environments and to production when you have a relationship with the people you need to make it happen. Instead of compromising the quality of your application by pushing everything to the database, why not give it a try?

Build Time Monitoring With Cruise

We’ve been using Cruise to handle continuous integration duties on my current project for several months now. One thing that Cruise doesn’t support out of the box is any way to easily monitor the trend in build duration over time. While we as developers get a pretty good feel for the build getting slower and slower, we don’t have any real statistics that we can use as evidence to back this up.

I feel it’s very important to have a quick build, and so having visibility over build times is essential. If we’re going to make any attempt to speed up a slow build, how can we measure any improvement we make without having easy access to these statistics?

One of Cruise’s strongest features is its RESTful API which gives us the ability to integrate with it. Cruise allows access to information about past builds, including the duration which the build took to run, through its Properties API. This API provides access to build data in CSV format and the fact that it’s all accessible through a simple HTTP client means that it’s really easy to consume and use as you wish.

I’ve been a big fan of jQuery for some time, and I recently stumbled across Flot, which allows you to plot various different kinds of chart directly in JavaScript. I’ve combined the two, with some help from the excellent Datejs library, to consume the data from the Cruise API and plot the results.

The build for our main application is divided into six stages within a single pipeline. I’ve chosen to plot each stage as a separate series on the same chart: the y-axis shows the build time in minutes and seconds, and the x-axis the date on which the build took place. I’m not going to focus on what we can learn from this chart in this post, but instead on how I created the chart itself. Here’s the JavaScript I ended up with.

I use a simple data structure to configure the URL to the Cruise properties API which will serve up the CSV (I’ve masked it in the code example), and the label which will be used to identify the series in the chart legend. I’ve only shown a single series in the code, but any number could be added to the array. Then it’s a case of iterating over each of the builds, using jQuery to make an Ajax call to get the CSV and parsing the result.
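As a rough sketch of that configuration-and-parse step: the URL below is a made-up placeholder rather than a real Cruise endpoint, the names builds, parseCsv and loadBuilds are hypothetical, and the parser assumes plain CSV with a header row and no quoted fields.

```javascript
// Hypothetical series configuration: one entry per pipeline stage.
// The URL is a placeholder, not a real Cruise endpoint.
var builds = [
  { label: "compile", url: "http://cruise.example/properties/compile.csv" }
];

// Parse simple CSV (header row, no quoted or escaped fields) into an
// array of objects keyed by the header names.
function parseCsv(text) {
  var lines = text.split(/\r?\n/).filter(function (line) {
    return line.trim() !== "";
  });
  if (lines.length === 0) {
    return [];
  }
  var headers = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    var cells = line.split(",");
    var row = {};
    headers.forEach(function (header, i) {
      row[header.trim()] = (cells[i] || "").trim();
    });
    return row;
  });
}

// Fetch every configured series with jQuery and pass the parsed rows on.
// jQuery is passed in as `$` so parseCsv stays usable without a browser.
function loadBuilds($, builds, onRows) {
  $.each(builds, function (index, build) {
    $.get(build.url, function (csv) {
      onRows(build, parseCsv(csv));
    }, "text");
  });
}
```

In a page that has jQuery loaded, this would be called as loadBuilds(jQuery, builds, callback), with the callback collecting the rows for each series.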

You’ll see I use Datejs to parse the timestamp indicating when the build took place. Cruise returns the timestamp in the format “2009-11-16 12:04:19 +1100”, which Datejs handles with ease.
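For anyone who would rather not pull in Datejs for this one format, here is a dependency-free sketch of the same conversion. It is not what Datejs does internally; it is a hand-rolled regular expression that only understands the exact layout quoted above, and parseCruiseTimestamp is a hypothetical name.

```javascript
// Convert a timestamp such as "2009-11-16 12:04:19 +1100" into
// milliseconds since the Unix epoch. Returns null if the text does
// not match that exact layout.
function parseCruiseTimestamp(text) {
  var pattern = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})$/;
  var m = pattern.exec(String(text).trim());
  if (!m) {
    return null;
  }
  // Treat the date and time fields as if they were UTC...
  var asUtc = Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]);
  // ...then remove the zone offset: a "+1100" local time is 11 hours
  // ahead of UTC, so the real instant is 11 hours earlier.
  var offsetMinutes = (+m[8]) * 60 + (+m[9]);
  var sign = m[7] === "+" ? 1 : -1;
  return asUtc - sign * offsetMinutes * 60 * 1000;
}
```

Millisecond timestamps are also what Flot expects on a time-mode axis, so the result can be used directly as the x value of a point.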

You’ll also notice I only plot successful builds. I included all builds originally, but the charts became too noisy, and it made seeing the overall trend of build times difficult when failed builds were also shown.
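The filtering itself can be sketched in a few lines. The column names used here (cruise_job_result, cruise_job_duration, cruise_timestamp_06_completed) and the "Passed" value are assumptions about what the CSV contains, so check them against the header row your own server returns. toPoints is a hypothetical helper, and the timestamp parser is passed in so that any parser (Datejs or otherwise) can be used.

```javascript
// Assumed column names; adjust to match the header row of your CSV.
var RESULT_COLUMN = "cruise_job_result";
var DURATION_COLUMN = "cruise_job_duration";
var COMPLETED_COLUMN = "cruise_timestamp_06_completed";

// Turn parsed CSV rows into [timestamp, durationInSeconds] points,
// keeping only successful builds and sorting them by time.
function toPoints(rows, parseTimestamp) {
  return rows
    .filter(function (row) {
      return row[RESULT_COLUMN] === "Passed";
    })
    .map(function (row) {
      return [parseTimestamp(row[COMPLETED_COLUMN]), Number(row[DURATION_COLUMN])];
    })
    .filter(function (point) {
      // Drop rows whose timestamp or duration could not be read.
      return point[0] !== null && !isNaN(point[0]) && !isNaN(point[1]);
    })
    .sort(function (a, b) {
      return a[0] - b[0];
    });
}
```

Each resulting array of points can then be used as the data of one chart series.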

Finally, I make use of the ajaxStop jQuery event, which is triggered when all Ajax requests are complete, at which point I plot all of the results using Flot. I’ve used a custom formatter for the y-axis, which uses Timejs (part of Datejs), to convert the build time which Cruise supplies as a number of seconds, to a more readable minutes and seconds format. I was surprised how much of a performance hit I got when rendering the chart with the custom formatter, but haven’t looked through the Timejs code to work out exactly why this is.
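As an alternative to the Timejs-based formatter, the conversion can be done with plain arithmetic, which may be worth trying if the formatter turns out to be the slow part. This is a sketch rather than the code I used: formatDuration, plotOptions and plotWhenLoaded are hypothetical names, and it assumes a Flot version with time-mode axes available (newer releases ship that as a separate time plugin).

```javascript
// Format a number of seconds as minutes and seconds, e.g. 125 -> "2:05".
// Fractional tick values are rounded to the nearest second first.
function formatDuration(totalSeconds) {
  var rounded = Math.round(totalSeconds);
  var minutes = Math.floor(rounded / 60);
  var seconds = rounded % 60;
  return minutes + ":" + (seconds < 10 ? "0" : "") + seconds;
}

// Flot options: dates along the x-axis, formatted durations on the y-axis.
var plotOptions = {
  xaxis: { mode: "time" },
  yaxis: {
    min: 0,
    tickFormatter: function (value, axis) {
      return formatDuration(value);
    }
  }
};

// Draw the chart once every outstanding Ajax request has finished.
// `series` is an array of { label: ..., data: [[x, y], ...] } objects.
function plotWhenLoaded($, placeholderSelector, series) {
  $(document).ajaxStop(function () {
    $.plot($(placeholderSelector), series, plotOptions);
  });
}
```

Registering the ajaxStop handler before the requests are sent means the plot happens exactly once per batch of requests, after the last CSV has arrived.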

Within the body of my HTML page, I have a div which the result is rendered into. You’ll see it’s easy to tweak the size of the chart and other aspects to fit your requirements.

I see this kind of chart working well as an Information Radiator, giving visibility over the trend in build time to the whole team. It would be easy to integrate this as part of your continuous integration process, and have the chart automatically redraw periodically to keep up to date with new builds. Ultimately, this kind of information can be used to highlight the increasing inefficiencies that development teams and their customers feel as builds get slower, and to measure improvements as we work towards making the build faster.