
Unit Testing: The Final Frontier – Legacy Code January 4, 2007

Posted by Imran Ghory in Software development, Software Testing.

These days unit testing seems to be synonymous with extreme programming and test-driven development. However, unit testing can be much more than just a programmatic way to specify and enforce a spec.

Consider the case where you have some confusing legacy code that you need to change to fix a bug. The fix itself is trivial, but you’re afraid that you’ll break something. If the bugfix is important, you might just have to make the change and hope that any negative side effects will be major enough for QA testing to notice; otherwise you might avoid fixing the bug at all. If this code were unit-tested, you could make the change with much more confidence.

Although the Pareto principle (aka the 80:20 rule) is heavily overused in software, it does actually seem to apply to bugs in legacy code in a measurable manner. Have a look at your own source code repository and see which functions/classes have had the most bugfix check-ins applied: about 80% of bugfixes tend to be made to about 20% of the code. There’s sound logic behind this: often that 20% of the code is poorly written, with dozens or hundreds of “special case” hacks.
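
One rough way to check this against your own repository is sketched below in Python. It assumes you have already extracted, from your version control log, the list of files touched by each bugfix check-in; the function name `bugfix_concentration` and the sample data are hypothetical and only there for illustration.

```python
from collections import Counter
import math

def bugfix_concentration(bugfix_commits, all_files, top_fraction=0.2):
    """Return the share of bugfix touches that land in the most-fixed files.

    bugfix_commits: iterable of lists of file paths, one list per bugfix check-in.
    all_files: every file in the code base (files that were never fixed count too).
    """
    touches = Counter(path for files in bugfix_commits for path in files)
    total = sum(touches.values())
    if total == 0 or not all_files:
        return 0.0
    # Size of the "top 20%" bucket, at least one file.
    top_n = max(1, math.ceil(len(all_files) * top_fraction))
    top_touches = sum(count for _, count in touches.most_common(top_n))
    return top_touches / total

# Made-up example data: ten files, ten bugfix check-ins.
files = [f"f{i}.c" for i in range(10)]
commits = [["f0.c"]] * 5 + [["f1.c"]] * 3 + [["f2.c"], ["f3.c"]]
share = bugfix_concentration(commits, files)
```

A value of `share` close to 0.8 would match the 80:20 pattern; the same counting could be done per function rather than per file if your tooling can attribute changes that finely.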

Whether this bad code arose through gradual evolution or just poor coding practices, the end result is the same: unreliable code which takes far more effort to maintain than it would take to rewrite or refactor.

However, developers are often so afraid of the consequences of making a mistake that they suffer from “refactoring paralysis”: they would rather perpetuate the bad code by adding hacks than try to improve it.

Hopefully I’ve convinced you of the practical usefulness of having unit tests for legacy code and now have you asking the question: “How can I unit test code if I don’t understand what it does?”

The answer is surprisingly trivial, but if you come from a TDD background you may well have overlooked it. When unit testing legacy code you don’t need to understand what the code should do; you only need to be able to observe what the code currently does.

You don’t break legacy code by making its behaviour incorrect; you break legacy code by making its behaviour different. Given that the code is currently behaving correctly (in a general sense anyway), you can use the current behaviour to establish correctness for your new code.
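
As a minimal sketch of this idea in Python: record what the code returns today for a set of inputs, then compare any changed version against that recording. The function `legacy_shipping_cost` is a made-up stand-in for confusing legacy code, and `record_behaviour` and `check_behaviour` are hypothetical helper names.

```python
def legacy_shipping_cost(weight_kg, express):
    """Stand-in for confusing legacy code (hypothetical, for illustration only)."""
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 3 // 2
    if weight_kg > 20 and not express:
        cost -= 4  # an unexplained special case we do not dare to touch
    return cost

def record_behaviour(func, inputs):
    """Run func on each input tuple and keep whatever it returns today."""
    return {args: func(*args) for args in inputs}

def check_behaviour(func, recorded):
    """Return the inputs whose result differs from the recorded behaviour."""
    return [args for args, expected in recorded.items() if func(*args) != expected]

# Observe the current behaviour once, before changing anything.
inputs = [(1, False), (1, True), (25, False), (25, True)]
baseline = record_behaviour(legacy_shipping_cost, inputs)

# After a bugfix or refactoring, an empty list means nothing observable changed.
differences = check_behaviour(legacy_shipping_cost, baseline)
```

Note that the test never states what the shipping cost ought to be. It only pins down what the code does now, which is all a regression test for legacy code needs.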

The best part of this type of testing is that in most languages you can automate the generation of this sort of regression unit test. I’m not aware of any mainstream packages which do this, but for most languages it is fairly straightforward to automate the process yourself.

For example, while I was working on some legacy C code I wrote a Perl script which did the following:

  1. Read in the header files I gave it.
  2. Extracted the function prototypes.
  3. Gave me the list of functions it found and let me pick which ones I wanted to create unit tests for.
  4. Created a dbx (Solaris debugger) script which would break every time a selected function was called, save the arguments passed to it, and then continue until the function returned, at which point it would save the return value.
  5. Ran the executable under the dbx script, at which point I used the application as normal, running through lots of use cases which I thought would exercise the code in question, especially those I thought would hit edge cases in the functions I wanted to create unit tests for.
  6. Took all of the example runs, stripped out duplicates, and autogenerated a C file containing a unit test for each example (i.e. pass in the input data and verify that the return value is the same as in the example run).
  7. Compiled, linked and ran the unit tests, and threw away the ones which failed (i.e. got rid of inputs which cause the function to behave non-deterministically).
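
The original tool was Perl driving dbx against C code and is not reproduced here. The following is only a rough Python analogue of steps 4 to 7, where a decorator plays the role of the debugger breakpoints. All names (`recording`, `deduplicate`, `drop_unstable`, `generate_tests`, `clamp`) are hypothetical, and the sketch assumes arguments and return values are hashable and have a usable `repr`.

```python
import functools

def recording(log):
    """Decorator factory: append (function name, args, result) to log on each call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            result = func(*args)
            log.append((func.__name__, args, result))
            return result
        return wrapper
    return decorate

def deduplicate(log):
    """Drop repeated observations, keeping first-seen order (entries must be hashable)."""
    return list(dict.fromkeys(log))

def drop_unstable(examples, functions):
    """Keep only the examples that still reproduce when replayed (step 7)."""
    return [(name, args, result) for name, args, result in examples
            if functions[name](*args) == result]

def generate_tests(examples):
    """Emit Python source with one assertion-based test per example (step 6)."""
    lines = []
    for i, (name, args, result) in enumerate(examples):
        lines.append(f"def test_{name}_{i}():")
        lines.append(f"    assert {name}(*{args!r}) == {result!r}")
        lines.append("")
    return "\n".join(lines)

log = []

@recording(log)
def clamp(value, low, high):
    """Hypothetical legacy function being observed."""
    return max(low, min(value, high))

# Stand-in for "using the application as normal" (step 5).
for call_args in [(5, 0, 10), (-3, 0, 10), (5, 0, 10)]:
    clamp(*call_args)

# Replay against the undecorated function so the replay itself is not logged.
examples = drop_unstable(deduplicate(log), {"clamp": clamp.__wrapped__})
source = generate_tests(examples)
```

Here `source` holds the text of a small test module that could be written to a file and run by whatever test runner you already use.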

Although the above looks fairly complex, it was only about 200 lines of hacked-together Perl. The resulting tool let me rapidly create regression unit tests.

Obviously it has problems with some situations (functions accessing external resources, global variables, structs containing pointers), but none of these are insurmountable; they could be worked around by writing a more sophisticated script without a vast amount of effort. Even this very primitive approach resulted in a set of unit tests that have proven themselves by spotting mistakes that I accidentally introduced while working on legacy code.

If you’re working with a more modern language in which most classes/types are serializable and in which you don’t have pointers, then you may well be able to avoid many of the problems the simplistic approach above has.

One common question I’ve had about this approach is how to test functions which access external resources. One simple method is to record the input/output from a live run, create a dummy function which acts as a black box mapping that fixed set of inputs to outputs, and then have the dummy called instead of the real function.
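
A minimal Python sketch of such a dummy function follows. The recorded values are invented for illustration, and `make_replay_stub`, `fetch_rate_stub` and `convert` are hypothetical names; in this sketch the external lookup is passed in as a parameter purely to keep the example short.

```python
def make_replay_stub(recorded_calls):
    """Build a dummy function that maps recorded inputs to recorded outputs."""
    table = dict(recorded_calls)

    def stub(*args):
        if args not in table:
            # Fail loudly rather than guess at inputs the live run never saw.
            raise LookupError(f"no recorded result for arguments {args!r}")
        return table[args]

    return stub

# Made-up recording from a "live run": (arguments, result) pairs.
recorded = [(("GBP", "USD"), 1.25), (("EUR", "USD"), 1.5)]
fetch_rate_stub = make_replay_stub(recorded)

def convert(amount, src, dst, fetch_rate):
    """Code under test; fetch_rate would normally talk to an external service."""
    return round(amount * fetch_rate(src, dst), 2)

converted = convert(100, "GBP", "USD", fetch_rate_stub)
```

Raising on unrecorded inputs is a deliberate choice: a test that silently invents an answer would hide exactly the kind of behaviour change these tests are meant to catch.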

With most languages you can do this fairly straightforwardly without having to change the code you want to test. How to do it is fairly language-specific (for example, in C you could link in your alternative functions or use the preprocessor to replace the calls to the real function with calls to your replacement), but your local language lawyer can probably help.
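
In Python, for instance, the standard library's `unittest.mock.patch.object` can swap a function for the duration of a test without editing the code under test. The sketch below assumes a made-up `LegacyBilling` class and invented recorded values; only the `mock` calls are real library API.

```python
from unittest import mock

class LegacyBilling:
    """Hypothetical legacy code that we do not want to edit."""

    def load_discount(self, customer_id):
        # Stands in for a call to an external resource such as a database.
        raise ConnectionError("would talk to a real database")

    def invoice_total(self, customer_id, amount):
        return amount - amount * self.load_discount(customer_id) // 100

# Invented recording of load_discount from a live run: arguments -> result.
recorded_discounts = {(42,): 10, (7,): 0}

def replay_discount(customer_id):
    """Dummy replacement that only knows the recorded inputs."""
    return recorded_discounts[(customer_id,)]

def test_invoice_total_matches_recording():
    billing = LegacyBilling()
    # Within this block, calls to load_discount are routed to the replay function.
    with mock.patch.object(billing, "load_discount", side_effect=replay_discount):
        assert billing.invoice_total(42, 200) == 180
        assert billing.invoice_total(7, 200) == 200

test_invoice_total_matches_recording()
```

Once the `with` block exits, the original method is restored, so other tests still see the real code.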

Hopefully in the future, tools which can generate this sort of test will become available commercially or in open-source development toolkits and eventually end up as common as debugging and profiling tools, but in the meantime you’ll just have to roll your own.

It can be a pain, and can require a lot of language expertise, to write a program/script which automatically generates this type of test for yourself, but in my experience it is well worth it.