Agile metrics. You get what you measure. In a bad way.

Agile development tends not to throw up usable metrics for management teams to feed on. If the goal is to foster self-organising teams that choose their own tools and techniques, then there are fewer opportunities to develop consistent performance measures. Working software is an intangible goal that resists standardisation, particularly when viewed across dozens of different teams.

Is this really a problem? The notion that "if you can't measure it, you can't improve it" is a conceit that speaks to the whole industry of time and motion study. Managers seek to measure their employees' performance to help optimise output. The dangerous assumption here is that managers know how best to organise production and can define an optimum process that can be measured and improved.

This does not sit well with the notion of self-organising teams. As developers are closer to the mechanics of production, they should have a better feel for how development should be organised. Demands for common metrics tend to force an expectation on teams. It assumes that they will adopt a certain methodology, adhere to a timetable or use a nominated set of management tools. This undermines their autonomy.

You get what you measure

One challenge with metrics is that they can affect the way teams behave. The observer effect in physics implies that simply observing something tends to change it. This certainly applies to measuring development, mainly because teams tend to perceive metrics as targets.

For instance, velocity is a planning tool rather than a specific measure. It describes relative capacity that shifts over time without necessarily indicating a change in productivity. It is also a completely arbitrary measure that tends to vary wildly between teams, even if you try to normalise it or force teams to adopt a common scale.

Smarter product owners who understand organisational dynamics will know how to manage statistics. If you track velocity you can pretty much guarantee that it will improve between sprints. Burn-down charts will always mysteriously converge on the expected output line by the end of a sprint. Team dashboards will always tend to look healthy with green lights everywhere.

This behaviour becomes more of a problem when it undermines productivity. For example, measuring estimation accuracy encourages teams to give generous estimates. Tracking the number of stories delivered encourages teams to split features into smaller items.

Some measures can actively undermine output quality. Comparisons between planned and delivered work merely encourage teams to commit to less work. They will either commit to fewer stories or just do less to deliver each story. The result can be code that is not tested carefully, and solutions that are a little less robust.

Measuring release dates is just strange in an agile context. Teams should be releasing working software at the end of each sprint. All that should change between releases is which features are shipped, not whether a release happens at all.

Some metrics can be useful

This isn't to say that metrics shouldn't be used at all. Managers have a duty of care over development teams. They need some means of tracking how well teams are performing so they can determine whether any support is needed. They manage the distribution of resources. There also needs to be a strategic view of where development is heading.

Some of the outputs of development can provide useful indications of overall health. Checking the amount of work in progress could be an indication of whether teams are producing finished software or being hamstrung by impediments. It can also be useful to measure the story pipeline to work out how long it takes ideas to be processed into delivered features.
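As a rough illustration, both of these signals can be derived from nothing more elaborate than the dates on which a story entered the backlog and the date it shipped. The sketch below assumes story records exported from whatever tracker a team happens to use; the field names and figures are hypothetical.

```python
from datetime import date
from statistics import median

# Hypothetical story records: when an idea entered the backlog and when it
# shipped. Field names are illustrative, not tied to any particular tool.
stories = [
    {"created": date(2023, 1, 4), "delivered": date(2023, 2, 15)},
    {"created": date(2023, 1, 10), "delivered": date(2023, 3, 1)},
    {"created": date(2023, 2, 1), "delivered": None},  # still in progress
]

# Work in progress: stories that have started but not yet shipped.
wip = sum(1 for s in stories if s["delivered"] is None)

# Lead time: days from idea to delivered feature, for completed stories only.
lead_times = [(s["delivered"] - s["created"]).days for s in stories if s["delivered"]]

print(f"Work in progress: {wip}")
print(f"Median lead time: {median(lead_times)} days")
```

The point is that these are trailing indicators of flow, not targets: a growing work-in-progress count or a lengthening lead time is a prompt to ask what is blocking the team, not a score to optimise.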

Defects can be a useful measure, as a rising defect count suggests that the team may need to double down on testing effort. If the same types of defect keep appearing, there may be gaps in the operational tooling. Likewise, the rate of deployment can tell you a lot about the relative stability of the system if emergency, patch or hot-fix releases keep happening outside of the normal sprint cadence.
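In the same spirit, a minimal sketch of the deployment signal might just compare planned sprint releases with unplanned hot-fixes over a period. The release log and its fields below are assumptions for illustration; most teams would pull this from their deployment pipeline.

```python
from datetime import date

# Hypothetical release log: each entry records a date and whether the release
# was a planned sprint release or an unplanned hot-fix.
releases = [
    {"date": date(2023, 3, 3), "planned": True},
    {"date": date(2023, 3, 10), "planned": False},  # emergency patch
    {"date": date(2023, 3, 17), "planned": True},
    {"date": date(2023, 3, 20), "planned": False},  # another hot-fix
]

unplanned = [r for r in releases if not r["planned"]]
ratio = len(unplanned) / len(releases)

# A rising share of unplanned releases hints at instability rather than pace.
print(f"Unplanned releases: {len(unplanned)} of {len(releases)} ({ratio:.0%})")
```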

Instead of measuring the development process, it can make sense to measure the business value being generated. This can be a slippery concept that is difficult to define. Some organisations seek to place a value on features in terms of the potential revenue attached to them. Customer satisfaction is equally hard to pin down, but it could provide a more meaningful measure of team performance if software is released regularly.

Metrics rather than targets

“Without data, you're just another person with an opinion.” (W. Edwards Deming)

Measurement is not a bad thing. It can help you to identify problems and determine how to evolve in the future. Problems can arise when metrics become targets, as they tend to distort behaviour and undermine the very things they seek to measure. Business-wide metrics can also weaken team autonomy by setting expectations around how software development should be organised.

Metrics are not everything. Some aspects of delivery cannot be measured at all, such as the rate of innovation or the culture of a team. You still need to manage these things but cannot rely on metrics to do so. The danger is that in the hunt for meaningful metrics you develop a skewed view of the organisation that over-amplifies those aspects that lend themselves more readily to figures.