Wednesday, 2 December 2009

Testers Heads Up Display

In 2008 I was fortunate to be invited to the Google Test Automation Conference in Seattle, USA. One of the talks I was most looking forward to was James Whittaker's on the Future of Testing.

During the talk, James Whittaker talked about the idea of a Testers Heads Up Display. He took the idea that while playing video games you always have some form of Heads Up display that gives you some advantage in the game. This can be your health status or a radar. What this Heads Up Display does is give you little bits of information that make the game more enjoyable. And as you get better with the game you rely more and more on this information.

In testing, the concept of a Heads Up Display means that you suddenly have a huge amount of data available in one go while testing your application. Our T.H.U.D. is just a layer that has been created on top of the eChannel application. It works by making a number of REST type calls to different services and then gathering all this information into the extra layer so we can have a view of what is happening in the system.
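A minimal sketch of that gathering step, assuming each service exposes a JSON endpoint. The service names, the URLs and the injected `fetch` callable are hypothetical and not taken from the eChannel application; the fetcher is a parameter so the sketch can be tried without a network.

```python
import json
from typing import Callable, Dict
from urllib.request import urlopen


def http_get_json(url: str) -> dict:
    """Default fetcher: GET a URL and decode the JSON body."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def gather_overlay_data(
    endpoints: Dict[str, str],
    fetch: Callable[[str], dict] = http_get_json,
) -> Dict[str, dict]:
    """Call each service once and collect the results for the overlay.

    A failing service becomes an error entry instead of breaking
    the whole display.
    """
    overlay = {}
    for name, url in endpoints.items():
        try:
            overlay[name] = {"ok": True, "data": fetch(url)}
        except Exception as exc:  # keep the overlay usable if one feed is down
            overlay[name] = {"ok": False, "error": str(exc)}
    return overlay
```

Recording a failed feed as an entry, rather than raising, keeps the rest of the display usable while one service is unavailable.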

At smartFOCUS DIGITAL the Testers Heads Up Display was first created to complement the reporting of our Performance Reporting Project that was worked on by David Henderson and myself.

The T.H.U.D would show the exact same data as our reporting portal, except that it uses sparklines and does not take up the entire screen to render. This allows us to view our site with the added benefit of being able to see our YSlow performance data, as shown below.



But the T.H.U.D still had not hit its potential for the needs of the testers. For it to be a decent Heads Up Display it needs to give you all the information that may be useful while testing.

The Editor became the next area to get the T.H.U.D. The Editor is an extremely complex sub-system of the smartFOCUS DIGITAL eChannel application because it needs to handle the numerous idiosyncrasies that each browser has in its rendering engine.

The T.H.U.D was expanded to show what was happening in the editor when using it in "Normal" mode. It shows the HTML that the browser generates while the user works on creatives. Below you can see a copy of our editor with the T.H.U.D showing what the HTML really looks like, with links to create bug reports. This means that bugs can be raised much more quickly, giving faster and better feedback loops.





A lot of this is possible with very little effort because our application lives within a browser. We are just creating a new layer on our application and extending it with REST type calls to different applications that may hold the external information. It has become an invaluable tool in our testing arsenal.


Wednesday, 11 November 2009

Performance Monitoring and Reporting - Our Story

This post summarises the presentation that we gave at the Google Test Automation Conference (GTAC) in October 2009. It describes some of the work that the smartFOCUS DIGITAL development team have been doing to both monitor and optimize the performance of our web applications.

The performance of a web application should be regarded as a feature rather than an afterthought. More organisations are noticing that when their application has any form of unintended latency, it affects their profits. Google saw a 20% drop in traffic when they added a 500ms delay. Yahoo! saw nearly a 10% drop in full page traffic when their load time increased by 400ms. Amazon noticed a 1% drop in sales when their pages took 100ms longer to load.

The performance of an application forms a large part of the user experience and influences the user’s impression of the application and the company it represents. There are a number of reasons to automatically measure Key Performance Indicators (KPI) of a web application:

  • As a developer, you can gauge the impact of changes that you are committing. Cumulative feature additions and bug fixes can lead to a web application that "feels sluggish", but has no immediately obvious culprit unless a developer keeps on top of performance tuning. If you are able to continuously measure performance, you will be able to rectify issues as they arise (functioning as the performance equivalent of a Unit test in a typical Continuous Integration setup).

  • As a tester, you can free up testing resource to be used throughout a sprint. The data that is collected can also be added to the Testers Heads Up Display for performance results.

  • As a manager, you gain a top-down view of site performance, which allows you to become aware of performance issues before your customers inform you.

  • As a member of an infrastructure team, you can use timing data gathered for a range of key actions on a site to gain an insight into the performance and load on application and database servers. This should reflect the experience as seen by the application’s users and may pick up on issues that are not immediately obvious through traditional server monitoring tools.

  • As a member of a support team, you can use historic performance data as a comparison point when diagnosing user issues. The changes that are needed to correct performance issues are not necessarily hard to implement, but a few small changes throughout a site can produce huge returns.

We needed to create a framework to allow us to find the performance issues affecting our application, fix them and monitor them to ensure we don't regress.

We chose YSlow, a free Firebug plugin from Yahoo!, to perform the measurements and produce useful, consistent data.

YSlow

As well as producing detailed, static reports within the plugin, YSlow has the ability to "beacon" the data it records to a pre-set web address using an HTTP GET request. YSlow can also be set to auto-run on page load, so you can manually walk through a site and measure each page.
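Because the beacon is a plain GET request, the receiving side only has to read the query string. A minimal sketch of that step follows; the parameter names (`u`, `o`, `w`) and the receiver URL used with it are illustrative assumptions, not the documented YSlow beacon format.

```python
from urllib.parse import parse_qs, urlsplit


def parse_beacon(request_url: str) -> dict:
    """Turn the query string of a beacon GET request into a flat record.

    Values made up only of digits are converted to int; everything
    else is kept as a string.
    """
    query = parse_qs(urlsplit(request_url).query)
    record = {}
    for key, values in query.items():
        value = values[-1]  # if a key repeats, keep the last value
        record[key] = int(value) if value.isdigit() else value
    return record
```

A record in this shape can be written straight to a database row, one row per measured page.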

While doing this we noticed that a number of reporting aspects were missing. No detailed information about the make-up of the page (e.g. component types, caching information) was sent.

As YSlow is a standard Firefox plugin, it was easy to poke around in the source code and set this right. We unzipped the XPI file that contains YSlow and made the modifications to send the extra caching and component data. We then rezipped it and installed it on a newly created Firefox profile.

(Note: This was the case at the time - for YSlow v1 and the early betas of v2. The latest version has an extremely comprehensive and well documented beaconing system! See http://developer.yahoo.com/yslow/help/index.html#yslow_beacon )

Selenium

We then needed to look at automating the process of walking through the site. Since smartFOCUS Digital uses Selenium for a lot of its automated testing, it was a natural choice for us to use.

For each section of the walk through, a new requestID is retrieved from the database and this is then sent with the beacon data, to allow all the data we collect to be tied together.
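The idea of tying data together by request ID can be sketched as below. This is an in-memory stand-in for illustration only; the class and method names are hypothetical, and in our setup the IDs come from the database.

```python
from collections import defaultdict
from itertools import count


class RequestLog:
    """In-memory stand-in for the database that issues request IDs."""

    def __init__(self):
        self._next_id = count(1)
        self._records = defaultdict(list)

    def new_request_id(self) -> int:
        """Issue a fresh ID at the start of each section of the walk-through."""
        return next(self._next_id)

    def store(self, request_id: int, record: dict) -> None:
        """Attach one piece of collected data to the ID it was sent with."""
        self._records[request_id].append(record)

    def records_for(self, request_id: int) -> list:
        """Everything collected under one request ID."""
        return list(self._records[request_id])
```

Anything that arrives carrying the same ID, whether a YSlow beacon or a timing measurement, can then be reported together.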

Once the tests had been running for a while we noticed that the data that was being recorded by YSlow was not what we were expecting. To YSlow, it appeared that we were not implementing any caching at all. After a bit of hunting through the Selenium RC Code base, we found the following lines of code and commented them out.

They are within the proxy that sits between the site under test and the machine driving the tests. By default, the proxy blocks ETag and Last-Modified headers from passing through, to ensure that the latest version is always being tested. This is brilliant for testing versionless software, but if you are testing versioned software and want to check that the caching is working properly, it can be rather annoying.


//response.removeField(HttpFields.__ETag); // possible cksum?  Stop caching... 
//response.removeField(HttpFields.__LastModified); // Stop caching
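The effect of those two lines can be illustrated with a small sketch. It is written in Python rather than the Java of Selenium RC and is not the proxy's actual code: once the validators are removed, a browser has nothing to send in a conditional request, so revalidation-based caching can never be observed.

```python
VALIDATOR_HEADERS = {"etag", "last-modified"}


def strip_validators(headers: dict) -> dict:
    """Mimic the proxy's default behaviour: drop ETag and Last-Modified."""
    return {k: v for k, v in headers.items() if k.lower() not in VALIDATOR_HEADERS}


def conditional_request_headers(response_headers: dict) -> dict:
    """Headers a client could send next time to revalidate its cached copy."""
    lowered = {k.lower(): v for k, v in response_headers.items()}
    conditional = {}
    if "etag" in lowered:
        conditional["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        conditional["If-Modified-Since"] = lowered["last-modified"]
    return conditional
```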


Action Timing

Once we had managed to get what we wanted from running the tests with YSlow, we wanted to know what other information would be useful. The first thing that came to mind, and that we have implemented, was the ability to measure how long things take to load. We started out by recording the time that it takes to load a page, but we also became interested in recording how long it takes for dialogs to load and for tree nodes to expand in our management view on the site.
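One way to take such measurements is to wrap each action in a timer. The sketch below is a generic illustration rather than our test code; the `timed` helper and the action name are hypothetical, and the real actions would be Selenium commands followed by a wait for the element to appear.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(action: str, results: dict, clock=time.perf_counter):
    """Record how long the enclosed block takes, in milliseconds."""
    start = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - start) * 1000.0
        results.setdefault(action, []).append(elapsed_ms)


# Usage: each run appends one measurement under the action's name.
timings = {}
with timed("open dialog", timings):
    pass  # click the button, then wait for the dialog to be visible
```

Recording in a `finally` block means a measurement is still stored when the action fails, which is often when the timing is most interesting.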

Reporting

Now that we are recording all of this useful data, we need a good way to analyse it and to monitor each build that is produced. As we predominantly produce web applications, a web reporting portal was the obvious solution. We make heavy use of jQuery within our site, so we went with the really great plotting plugin, flot. The data is pulled from the database through a JSON webservice and used to create two flot plots of the YSlow data for each page.
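flot takes its data as a list of series, each with a label and a list of [x, y] points, so the webservice mostly has to group rows into that shape. A minimal sketch, assuming a hypothetical row layout of (build index, component type, size in KB); our actual schema and service are not shown here.

```python
import json
from collections import defaultdict


def to_flot_series(rows):
    """Group (build_index, component, size_kb) rows into flot series.

    flot accepts a list of series objects, each with a "label" and a
    "data" list of [x, y] points.
    """
    points = defaultdict(list)
    for build_index, component, size_kb in rows:
        points[component].append([build_index, size_kb])
    return [
        {"label": component, "data": sorted(data)}
        for component, data in sorted(points.items())
    ]


# Usage: serialise the series as the JSON body of the webservice response.
rows = [(2, "css", 30), (1, "css", 40), (1, "js", 120)]
payload = json.dumps(to_flot_series(rows))
```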

Size Plot

The size plot shows the size of the page as a whole, as well as the size of each type of component (e.g. JavaScript, CSS) that makes up the page. Along with this, the cached versions of these values are plotted as well (i.e. the size you'd download if you visited with a full cache).

The build numbers are plotted along the x-axis: vertical bars highlight when each new build was introduced. The darker bars represent minor build number changes (e.g. 1.2.0) and the lighter bars represent numbered builds (e.g. 1.2.1 and 1.2.2).

YSlow Data Plot

The YSlow data plot is in the same format as the size plot, but plots the scores for each of the YSlow categories on the Y-Axis.

Timing Plot

Each page has at least the load time recorded, along with other actions as appropriate. The timing plots show these in the same way as the size and YSlow plots, with times averaged over multiple runs.
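The averaging step can be sketched as follows, assuming each measurement is stored as a (build, action, milliseconds) tuple. That layout is a hypothetical one chosen for the example.

```python
from collections import defaultdict
from statistics import mean


def average_timings(samples):
    """Average times per (build, action) over repeated runs.

    `samples` is an iterable of (build, action, milliseconds) tuples.
    """
    grouped = defaultdict(list)
    for build, action, ms in samples:
        grouped[(build, action)].append(ms)
    return {key: mean(values) for key, values in grouped.items()}
```

Averaging smooths out one-off slow runs, so a point on the plot moves only when a build is consistently faster or slower.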

Delta Plot

The delta plot was introduced to give a good "top-down" view of the site changing over time. It shows the change in page size relative to the previous numbered build for every page that is monitored. This allows you to see if there is an issue that affects every page on the site (the mass of lines will spike upwards together) or a large change that affects only a single page (a single line will break away from the pack and be noticeable).
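The calculation behind the delta plot is a difference between consecutive builds. A minimal sketch, assuming the sizes for each page are already ordered by build; the function name and page names are hypothetical.

```python
def size_deltas(sizes_by_page):
    """Change in page size relative to the previous build, per page.

    `sizes_by_page` maps a page name to a list of sizes ordered by build.
    The first build has no predecessor, so each result is one item shorter.
    """
    return {
        page: [current - previous for previous, current in zip(sizes, sizes[1:])]
        for page, sizes in sizes_by_page.items()
    }
```

If every page shows the same positive delta at one build, the cause is likely something shared, such as a common script; a delta on one page alone points at that page.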

What's happened since the Google Test Automation Conference

We have been in contact with Yahoo! and have asked for our code changes to be merged in. They have had similar thoughts to us and have already implemented some changes and will add the rest. They also have implemented a YSLOW.firefox.run() method that can be used to trigger a YSlow run. This means that we can easily run it with selenium.GetEval().

So that is a quick run through of the work that was presented at GTAC. If you want to know more, the slides and videos from GTAC are embedded below (and available with the rest at GTAC.biz).

There was also a "live-waving" commentary of the presentation on Google Wave. You should be able to find it by searching "group:gtacgroup@googlegroups.com".





Monday, 19 October 2009

Automated Performance Collection and Reporting

On Thursday, two of the smartFOCUS Digital development team will be speaking at the Google Test Automation Conference in Zurich, Switzerland.

David Burns and David Henderson will be discussing the framework that they developed to record the performance of smartMARKETER eChannel. They will be discussing how they went from a manual approach to a fully automated process for recording performance data. From this data we can produce a view of the user experience we are offering. This technology has already significantly improved the performance of the smartMARKETER eChannel admin booth over the past several months.

Performance Framework

The framework works by recording the structure of the page that has been rendered and then running the page through a ruleset to produce a score of how the page has performed.
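To illustrate what scoring a page against a ruleset means, here is a minimal sketch using a weighted average of per-rule scores. The two rules, their thresholds and the weights are hypothetical examples, not the ruleset the framework actually uses.

```python
def score_page(page, rules):
    """Weighted average of rule scores, each rule returning 0 to 100.

    `rules` is a list of (weight, rule_function) pairs.
    """
    total_weight = sum(weight for weight, _ in rules)
    if total_weight == 0:
        raise ValueError("at least one rule needs a non-zero weight")
    weighted = sum(weight * rule(page) for weight, rule in rules)
    return weighted / total_weight


def few_requests(page):
    """Hypothetical rule: lose 10 points for each request beyond 10."""
    extra = max(0, len(page["components"]) - 10)
    return max(0, 100 - 10 * extra)


def compressed_text(page):
    """Hypothetical rule: percentage of components served gzipped."""
    components = page["components"]
    if not components:
        return 100
    gzipped = sum(1 for c in components if c.get("gzip"))
    return 100 * gzipped / len(components)
```

Keeping each rule as a separate function makes it easy to see which rule pulled a page's score down between builds.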



This score has helped us prioritize work so that we can offer the best user experience to our users. The framework has shown that we have made reductions of up to 85% in the size of the page that is delivered to our users. This has meant that our users, who use the system for many hours of the day, do not need to download as much data when they use our application. The framework has also pointed out other optimizations, like the format of the data we transmit, that needed to be made so that our users can have a very good experience with our application.

They will be explaining how this framework was developed and how it fits into our monthly iterations. David Burns will also discuss how it is part of our test automation strategy within the company. He will also be discussing one of the side projects that has spun off from this work. The "Testers Heads Up Display" allows testers to use this performance data, along with other data feeds like bug tracking and the source repository, to help diagnose issues as they explore the system.

We will be publishing the slides from their presentation with a demo of the reporting portal that was created. The presentation will also be recorded and will be available on YouTube a few weeks later.

We wish them luck with their presentation on Thursday afternoon!


Friday, 25 September 2009

smartFOCUS DIGITAL at Google Test Automation Conference

smartFOCUS Digital is pleased to announce that two of its developers will be speaking at the Google Test Automation Conference to be held in Zurich, Switzerland next month. David Burns and David Henderson will be speaking about a framework they developed to automate performance test data collection and the reporting of the data. KPIs are generated which focus development efforts on the most important parts and track improvements of the product.

Details of the talk can be found here. We wish them luck with their talk!

Thursday, 17 September 2009

Rebuilding the Development Process Part 1

Early in 2008, our development programme was in trouble. We were developing a long-standing web product with a very stable and well-proven core. We'd tried to expand quickly and had hired new people, but it wasn't really giving us returns in productivity. The existing system worked well, but we were having a lot of trouble getting enhancements completed.

Part of it was that we had made a bad hire - a developer who hated the technology we use, hated us, and generally preferred to play on his Playstation rather than doing any work. It dragged the team down for a while. But even when he was out, we still had work to do. Also a backlog of feature work meant that a lot of developer time was being redirected to urgent client fixes, displacing planned features, and ironically increasing the backlog.

We had - and still have - really great people. Bucketloads of experience and technology. But in early 2008 we still weren't productive as a team. The usual development issues plagued us, which will be familiar to anybody who's run a big development project. New features were breaking existing parts of the system, everything was running late, we kept having to break off to meet customer needs. As always, the customer needs were *always* urgent, and pushed us further off track.

We decided to make three changes to start with:
* Invest in automated test
* Create continuous integration servers
* Switch to monthly releases (see a later post)

Automated Test

We figured that with our limited test resources and a very large system, automated test was the only way to go. Just walking through the system manually took days. So we started to invest in various types of automated test:

* Functional testing, using the SOAP API that we expose for clients
* UI testing. Options for testing Web UIs are very limited. We picked Selenium, which has turned out to be a really good choice.
* Module testing, using NUnit.

We started small, with relatively few tests, and picked off some low hanging fruit by automating the existing APIs, and some of the simpler pages. Enough to get results with modest investment.

Continuous Integration

We wanted something that would let us know as soon as something got broken by a change. CI, using CC.Net, fitted the bill perfectly. It tirelessly rebuilds the latest state of the project and re-runs a key subset of the automated tests. Automating the whole build process up-front was too hard, so we took a stepwise approach, first picking off the parts of the system that were easy to automate. Then, over a few months, we automated the whole build bit by bit.

Combining the two approaches, we then made the CI server run the key automated tests. Being open source, CC.Net is easy to extend and very flexible. It neatly integrated tests on all the different technologies in the system - JavaScript client code, ASP, ASP.NET, VB and C# back-ends. So we were able to add tests into the CI process one by one.

Early success came with the immediate feedback when the build was broken by code being checked in. CC.Net alerted the development team within a couple of hours, and the developer could fix their code quickly and with minimal effort. Without continuous integration, we wouldn't have found the bugs in the new check-ins for weeks or months, by which time it would have taken major investigation to identify the cause.

Payback

So we were convinced that the new processes were helpful, but did they make business sense?

Setting up the CI servers took about 4 person weeks of time, and on its own saved about 2 person days per month. So direct payback time is about 10 months. But the biggest savings are that, because everyone is developing against an up-to-date solution, bugs are noticed early, they can be fixed quickly and informally, and this saves bucketloads of time for developers.

Setting up the first batch of automated tests took around 3 person months, and directly saved around 3 days per month. So payback time of about 1.5 years. Not so good?
* Automated functional testing and UI testing are great, and also mean that bugs are noticed early, so they can be fixed quickly.
* Automated module tests are more controversial: some of us are keen, while one believes they have a negative value. The upside is that they spot side-effects of programming changes, as when a 1-line change breaks a function. The downside is that they can make code brittle, and code refactoring takes longer because of patching the tests. Also, module tests do not deal well with the weird intermittent exceptions that you get when system services are stressed - the best approach is to make a guess and simulate them by "mocking", then extend the module tests when errors show up in production.
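The direct payback figures quoted in this section can be checked with a few lines of arithmetic. The sketch assumes five working days per person-week and twenty per person-month; those conversion factors are assumptions and are not stated in the figures above.

```python
def payback_months(setup_person_days: float, saved_days_per_month: float) -> float:
    """Months until the direct savings equal the setup cost."""
    if saved_days_per_month <= 0:
        raise ValueError("savings per month must be positive")
    return setup_person_days / saved_days_per_month


DAYS_PER_WEEK = 5    # assumption: five working days in a person-week
DAYS_PER_MONTH = 20  # assumption: twenty working days in a person-month

ci_payback = payback_months(4 * DAYS_PER_WEEK, 2)     # CI servers
test_payback = payback_months(3 * DAYS_PER_MONTH, 3)  # first batch of tests
```

Under these assumptions the CI servers pay back in 10 months and the first batch of tests in 20 months, which is in the region of the "about 1.5 years" quoted; a slightly shorter working month brings it closer to 18.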

The final improvement was in the agility of the development and release process. Fewer newly introduced bugs were making it out to the field, so we could start to develop and deploy faster, with less risk of causing problems for clients or our support teams.


Conclusions

Implementing automated test and continuous integration:

* both pay back within two years, considering direct savings only.
* together they give significant synergy - the savings are "more than the sum of their parts"
* give many savings and benefits from outside of the test function, with significant benefits for development, support, operations and clients.
* give greatly increased agility of the development and deployment process. This allows new developments to be completed faster and with less business risk.

Overall, implementation of CI and automated test was a major success in 2008. We released software approximately 3 times during 2008, with quality and productivity steadily improving. The next step was to speed up the development process to monthly iterations. I'll describe this in a later post.

Wednesday, 16 September 2009

About Us

smartFOCUS DIGITAL creates software for digital marketing. That includes
  • email marketing
  • integrated microsite hosting
  • SMS and other mobile channels
  • social network marketing
We usually host this software for our clients, who are typically medium to large companies based anywhere in the world. Clients use the software on a SaaS basis, and we have several data centers. We're part of the smartFOCUS group, producing intelligent marketing and campaign software for off-line channels as well as digital.

This Blog will cover the thoughts and progress of the development team here in the south of England. It'll include development management/process, development techniques, test - and particularly automated test.

We started migrating our development process to Agile in 2008. At an early stage, we made a big push towards Continuous Integration and Automated Test. Then from Jan 2009, we switched the development process to monthly iteration cycles. The results have been really positive and interesting - I'll talk more about these in a later post.