Forget code coverage – test design should be driven by behaviours

Test coverage statistics are much loved by management teams and code quality tools. They tend to associate a high level of coverage with robust, well-managed code bases. Wrongly, as it turns out: code coverage tells you nothing about what is actually being tested.

I wouldn't dispute that comprehensive test coverage is an ideal and I generally prefer a test-driven approach to writing code. The reality is that teams often practise a form of test-assisted development where tests are written alongside code rather than in advance of it. This leaves them playing catch-up in terms of writing tests to verify features that have already been written. They turn to code coverage analysis to benchmark how well they are doing.

The temptation is to game the statistics and bump up the score at all costs. In straining for higher coverage, developers can produce meaningless tests in pursuit of executing a stray piece of code. So long as the code gets executed somehow, it doesn't really matter what's being tested.

Chasing coverage and “stick-based testing”

Many automated test stacks written with coverage in mind are a form of “stick-based testing”, i.e. a developer pokes a method with a stick and is satisfied with any kind of response. This gives rise to test suites where most methods only have a single test associated with them. This may verify a “happy path” for calling a method, but no consideration is given to error conditions, bad inputs or edge cases. The method gets added to the code coverage pot and everybody's happy.
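As a minimal sketch of what this looks like in practice (JUnit 5, with a hypothetical OrderParser stubbed inline so the example stands alone), a stick-based test pokes the method once and accepts any response:

```java
import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.junit.jupiter.api.Test;

class OrderParserTest {

    // Hypothetical class under test, stubbed here so the example compiles.
    static class OrderParser {
        String parse(String input) {
            return input.isEmpty() ? null : input.trim();
        }
    }

    @Test
    void testParse() {
        // One happy-path poke: any non-null response passes. Coverage tools
        // report parse() as covered, yet empty input, malformed input and
        // boundary cases are never exercised.
        assertNotNull(new OrderParser().parse("id=42;qty=1"));
    }
}
```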

These tests give teams a false sense of security. Test suites are regarded as “complete” even though most features are not being verified as part of the build process. Many features don't get tested at all until the components are installed onto integrated staging environments.

This can create an unfortunate feedback loop where teams start to regard these environments as the place where the majority of problems are picked up. It encourages the development of large, complex staging environments as the focus of the majority of system testing effort.

These kinds of environments are increasingly regarded as an anti-pattern. They are unstable and expensive to maintain. The feedback loop associated with integrated tests tends to be very long. However, in the absence of properly targeted unit tests, integration environments end up being the only means of verifying that anything works.

If you find a bug in these integrated environments then not only do you have a bug in your functional code, you're probably also missing a unit test. Rather than working to eliminate the bug from staging, you should fix it for good by writing a unit test that covers the problem as part of the resolution.
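As a sketch of what this can look like (all names here are hypothetical), the fix for a staging failure ships together with a regression test that pins the corrected behaviour down:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class DiscountCalculatorTest {

    // Hypothetical class under test: suppose staging uncovered a crash when
    // a customer had no loyalty tier. The fix and this test land together,
    // so the bug cannot silently return.
    static class DiscountCalculator {
        int percentFor(String loyaltyTier) {
            if (loyaltyTier == null) {
                return 0; // the fix: a missing tier means no discount
            }
            return loyaltyTier.equals("GOLD") ? 10 : 5;
        }
    }

    @Test
    void missingLoyaltyTierGetsNoDiscountInsteadOfCrashing() {
        assertEquals(0, new DiscountCalculator().percentFor(null));
    }
}
```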

Focusing on behaviours rather than coverage

There's an element of cargo cultism in targeting code coverage as developers race towards a higher number. Chasing after a percentage will not provide any guarantee of a robust and well-tested code base.

A more helpful approach can be to focus on the behaviours or features that are being implemented in each test. In practical terms this can involve a BDD-influenced approach of writing a simple sentence to describe the behaviour that is implemented by each test case. The description itself should be attached to the test method, e.g. using the @DisplayName annotation in JUnit or the Description attribute in MSTest.
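For example, a minimal JUnit 5 sketch (the Account class is a hypothetical stand-in) attaches the behavioural sentence via @DisplayName, and that sentence is what test reports then display:

```java
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

class WithdrawalTest {

    // Hypothetical domain class, stubbed so the example stands alone.
    static class Account {
        private int balance;
        Account(int balance) { this.balance = balance; }
        void withdraw(int amount) {
            if (amount > balance) {
                throw new IllegalStateException("insufficient funds");
            }
            balance -= amount;
        }
    }

    @Test
    @DisplayName("A withdrawal that exceeds the balance is rejected")
    void rejectsOverdraw() {
        Account account = new Account(100);
        assertThrows(IllegalStateException.class, () -> account.withdraw(150));
    }
}
```

The plain-English sentence, rather than the method name, becomes the primary record of what the test verifies.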

This simple discipline can be surprisingly effective in focusing attention on functionality. It's always difficult to describe test behaviour clearly in a method name, no matter what naming convention you are using. There's a tendency to over-abbreviate or create jumbled words that are hard to read. Focusing on text-based descriptions makes it much easier to determine what a test is supposed to be doing.

This approach can also help to draw non-technical stakeholders into helping to define the test suite. Product owners can start to take ownership of what's being tested rather than leaving it as a developer afterthought. In time you build a meaningful set of documentation for the system that can be referred to by the entire team.

You can pay lip service to agile rhetoric around test pyramids and test-driven development and still be left with an anaemic test stack. A behavioural approach makes it easier to identify meaningful test cases that test genuine features. The resulting code coverage will tend to be lower, but the quality of coverage will be much higher where it really matters.