
How to pick a good Computer Science degree May 16, 2007

Posted by Imran Ghory in Computer Science, Software development.

I recently came across a scathing article by a former Computer Science professor from the University of Leeds attacking his former department for “dumbing down” their curriculum. In the discussion on reddit that followed, the question was raised of which universities still have good compsci departments.

I thought I’d try to answer that indirectly by coming up with a set of criteria any prospective student could use to judge for themselves how good an undergraduate compsci course is.

So without further ado the criteria:

  • How difficult are the modules/subjects offered ? – Do they include math-heavy topics such as cryptography, complexity, quantum computing, speech processing, etc.? Do they include theoretical topics such as information theory and concurrency theory? Are the programming courses in depth, covering functional programming and language engineering, or are they just “what is a loop” lectures?
  • Are there specialist units/subjects taught by researchers in that area ? – If a uni is teaching unique courses that are only available at a handful of universities around the world because of its lecturers’ specialisms, then that’s probably a good sign. It means lecturers are driving the curriculum and you’re likely to have lecturers who are genuinely passionate about what they teach.
  • What are the average entry grades for students ? – This matters for a number of reasons, not least because having intelligent, motivated students means that lecturers won’t have to dumb down their material. Lecturers have to make sure they’re teaching at the right level for their students. If you’re an A* pupil in a class of D students then you’re going to feel unchallenged, as the work will be aimed at a level far below you.
  • Who recruits at the university ? – Large tech companies tend to have a very good idea which universities are producing the best compsci graduates, based upon the quality of the graduates they’ve hired. So look at a university’s website and see which companies regularly recruit there. Most big technology firms, investment banks, consultancies, etc. have campus calendars on their websites showing where they recruit.
  • What do students do for final year projects ? – if the majority are doing “e-commerce websites” then it’s probably time to run away. If the majority are doing “hard-core” innovative and interesting computer science projects across a range of areas then it’s probably a good sign.

Does anyone have any other suggestions for good criteria – can we establish the equivalent of The Joel Test for universities ?

Using FizzBuzz to Find Developers who Grok Coding January 24, 2007

Posted by Imran Ghory in job interviews, Software development.

On occasion you meet a developer who seems like a solid programmer. They know their theory, they know their language. They can have a reasonable conversation about programming. But once it comes down to actually producing code they just don’t seem to be able to do it well.

You would probably think they’re a good developer if you’d never seen them code. This is why you have to ask people to write code for you if you really want to see how good they are. It doesn’t matter if their CV looks great or they talk a great talk. If they can’t write code well you probably don’t want them on your team.

After a fair bit of trial and error I’ve come to discover that people who struggle to code don’t just struggle on big problems, or even smallish problems (e.g. write an implementation of a linked list). They struggle with tiny problems.

So I set out to develop questions that can identify this kind of developer and came up with a class of questions I call “FizzBuzz Questions” named after a game children often play (or are made to play) in schools in the UK.

In this game children sit in a group and say each number in sequence, except if the number is a multiple of three (in which case they say “Fizz”) or five (when they say “Buzz”). If a number is a multiple of both three and five they have to say “Fizz-Buzz”.

An example of a Fizz-Buzz question is the following:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
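For reference, here is one possible answer, sketched in Python (the question itself doesn’t care about the language, and the function name `fizzbuzz` is just my choice). The only real trap is ordering: the “multiple of both” case has to be checked before the individual cases.

```python
def fizzbuzz(n):
    """Return what should be printed for the number n."""
    if n % 15 == 0:          # multiple of both three and five
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


if __name__ == "__main__":
    for i in range(1, 101):
        print(fizzbuzz(i))
```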

Most good programmers should be able to write out on paper a program which does this in under a couple of minutes.

Want to know something scary ? – the majority of comp sci graduates can’t. I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.

I’m not saying these people can’t write good code, but it will take them a lot longer to write it and ship it. And in a business environment that’s exactly what you don’t want.

This sort of question won’t identify great programmers, but it will identify the weak ones. And that’s definitely a step in the right direction.

Unit Testing: The Final Frontier – Legacy Code January 4, 2007

Posted by Imran Ghory in Software development, Software Testing.

These days unit testing seems to be synonymous with extreme programming and test driven development; however, unit testing can be much more than just a programmatic way to specify and enforce a spec.

Consider the case where you have some confusing legacy code which you need to change to fix a bug. The fix itself is trivial, but you’re afraid that you’ll break something. If the bugfix is important, you might just have to make the change and hope that any negative side-effects will be major enough for QA testing to notice; otherwise you might just avoid fixing the bug altogether. If this code were unit-tested, you could make the change with much more confidence.

Although the Pareto principle (aka the 80:20 rule) is heavily over-used in software, it does actually seem to apply to bugs in legacy code in a measurable manner. Have a look at your own source code repository and see which functions/classes have had the most bugfix checkins applied: 80% of bugfixes tend to be made to about 20% of the code. There’s sound logic behind this – often that 20% of the code is poorly written, with dozens or hundreds of “special case” hacks.

Whether this bad code arose out of evolutionary effect or just poor coding practices, the end result is the same: unreliable code which takes far more effort to maintain than it would take just to rewrite or refactor.
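If you want to check this against your own repository, a rough sketch in Python might look like the following. It assumes you have already exported your history as (commit message, list of changed files) pairs, that bugfix checkins can be spotted by keywords in the message, and that counting per file is a good enough stand-in for per function/class; all three are simplifications, and the names `bugfix_hotspots` and `share_of_fixes` are made up for this example.

```python
from collections import Counter


def bugfix_hotspots(commits, keywords=("fix", "bug")):
    """Count how many bugfix checkins touched each file.

    `commits` is an iterable of (message, [paths]) pairs. Matching on
    keywords is crude (it will also match "prefix"), so treat the
    result as a rough guide rather than a measurement.
    """
    counts = Counter()
    for message, paths in commits:
        lowered = message.lower()
        if any(word in lowered for word in keywords):
            counts.update(set(paths))   # count each file once per checkin
    return counts


def share_of_fixes(counts, top_fraction=0.2):
    """Fraction of all bugfix touches that hit the most-fixed files."""
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)
```

Calling `bugfix_hotspots(commits).most_common(10)` then gives you a shortlist of the files most in need of regression tests.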

However, developers are often so afraid of the consequences of making a mistake that they suffer from “refactoring paralysis”: they’d rather perpetuate the bad code by adding hacks than try to improve it.

Hopefully I’ve convinced you of the practical usefulness of having unit tests for legacy code and now have you asking the question: “How can I unit test code if I don’t understand what it does ?”

The answer is surprisingly trivial, but if you come from a TDD background you may well have overlooked it. When unit testing legacy code you don’t need to understand what the code should do, you only need to be able to observe what the code currently does.

You don’t break legacy code by making its behaviour incorrect, you break legacy code by making its behaviour different. Given that the code is currently behaving correctly (in a general sense anyway), you can use the current behaviour to establish correctness for your new code.
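To make the idea concrete, here is a minimal sketch in Python. The function `legacy_round_price` is an invented stand-in for some confusing legacy code, and `snapshot` and `changed_behaviour` are hypothetical helper names; the point is only that the expected values come from running the current code, not from a spec.

```python
def legacy_round_price(pence):
    """Invented stand-in for legacy code nobody fully understands."""
    if pence < 0:
        return 0                    # quirk: negative prices become 0
    return (pence + 2) // 5 * 5     # round to the nearest 5p


def snapshot(func, inputs):
    """Record what the code currently does for each input."""
    return {arg: func(arg) for arg in inputs}


def changed_behaviour(func, recorded):
    """Return {input: (old, new)} wherever func now behaves differently."""
    changes = {}
    for arg, expected in recorded.items():
        actual = func(arg)
        if actual != expected:
            changes[arg] = (expected, actual)
    return changes


# Taken once, before touching the code.
recorded = snapshot(legacy_round_price, [-3, 0, 2, 3, 7, 8])


def patched_round_price(pence):
    """A "harmless" cleanup that drops the negative-price special case."""
    return (pence + 2) // 5 * 5
```

Here `changed_behaviour(patched_round_price, recorded)` flags the input -3, because the cleanup changed what callers get back, whether or not the old answer was “right”.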

The best part of this type of testing is that in most languages you can automate the generation of this sort of regression unit test. I’m not aware of any mainstream packages which automate the process, but for most languages it is fairly straightforward to automate this process yourself.

For example, while I was working on some legacy C code I wrote a Perl script which did the following:

  1. Read in the header files I gave it.
  2. Extracted the function prototypes.
  3. Gave me the list of functions it found and let me pick which ones I wanted to create unit tests for.
  4. Created a dbx (Solaris debugger) script which would set a breakpoint on each selected function, save the arguments passed to it every time it was called, and then continue until the function returned, at which point it would save the return value.
  5. Ran the executable under the dbx script, at which point I used the application as normal, running through lots of use cases which I thought would exercise the code in question, especially cases which I thought would hit edge cases in the functions I wanted to create unit tests for.
  6. Took all of the example runs, stripped out duplicates, and autogenerated a C file containing a unit test for each example (i.e. pass in the input data and verify the return value is the same as in the example run).
  7. Compiled/linked/ran the unit tests and threw away the ones which failed (i.e. got rid of inputs which cause the function to behave non-deterministically).

Although the above looks fairly complex, it was only about 200 lines of hacked-together Perl. The resulting tool let me rapidly create regression unit tests.

Obviously it has problems with some situations (functions accessing external resources, global variables, structs containing pointers), but none of these are insurmountable; they could be worked around by writing a more sophisticated script without a vast amount of effort. Even this very primitive approach resulted in a set of unit tests that have proven themselves by spotting mistakes that I accidentally introduced while working on legacy code.

If you’re working with a more modern language in which most classes/types are serializable and in which you don’t have pointers then you may well be able to avoid many of the problems the simplistic approach above has.

One common question I’ve had about this approach is how to test functions which access external resources. One simple method is to record the input/output from a live run, create a dummy function which acts as a black box mapping a fixed set of inputs to outputs, and have that called instead of the real function.

With most languages you can do this fairly straightforwardly without having to change the code you want to test. How to do it is fairly language specific (for example in C you could link in your alternative functions or use the pre-processor to replace the calls to the real function with calls to your replacement) but your local language lawyer can probably help.

Hopefully in the future tools which can generate this sort of test will become available commercially or in open-source development toolkits, and eventually end up as common as debugging and profiling tools, but in the meantime you’ll just have to roll your own.

It can be a pain and require a lot of language expertise to write a program/script which automatically generates this type of test for yourself, but from my experience it is well worth it.