Estimation pitfalls: Why software development estimates are so difficult

Estimates are, in essence, a kind of crystal ball gazing. You are making a series of educated guesses about how long something will take whilst often being hampered by imperfect understanding and squeezed by commercial pressures.

Mythical measurement

A key problem with estimates is the very unit that is used to measure the amount of effort required - i.e. “man days” or “man hours”. This type of measurement tends to produce estimates that imply that a system can be delivered with a large package of time-based effort. It suggests that time and people are interchangeable.

In “The Mythical Man-Month”, Fred Brooks described the man-month as a unit for measuring the size of a job as a “dangerous and deceptive myth”. He was writing from an early 1970s mainframe perspective, but much of his analysis holds true even on modern digital projects.

A software project cannot be produced more quickly just by throwing more developers at it. People and time are only interchangeable when there are no complicating factors such as communication, dependent tasks or learning curves to deal with.

Each new developer added to a project creates a communication overhead in terms of training and familiarisation. This can be planned for, but what is often missed is the growing burden of extra inter-communication within the project team. The number of possible person-to-person conversations grows roughly with the square of the team size, so a team of twenty people will require far more effort to marshal than a team of five.
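To put a rough number on that, one common simplification is to count the possible one-to-one communication paths in a team: n(n-1)/2 for n people. The sketch below assumes every pair of people may need to talk, which is an illustrative assumption rather than a measure of real effort; the function name is made up for this example.

```python
def communication_paths(team_size: int) -> int:
    """Count the distinct pairs of people in a team of the given size."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # Each person can pair with everyone else; halve to avoid double counting.
    return team_size * (team_size - 1) // 2


for size in (5, 10, 20):
    print(f"{size} people -> {communication_paths(size)} possible pairs")
```

On this count a team of five has 10 possible pairs and a team of twenty has 190, so a team four times the size has nineteen times as many potential conversations to manage.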

Failing to properly take account of task dependencies is another limiting factor. After all, a pregnancy could be described as a “nine man month project”, but you won't deliver it any faster by adding more people to it. A good project plan will try to take account of dependent tasks, but most only manage this on a broad level. The subtleties of task dependencies in software development will generally prevent developers from being freely swapped around between different tasks.
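The effect of dependencies can be sketched by comparing the total effort in a plan with the longest chain of dependent tasks. The task names and durations below are entirely hypothetical, and the sketch assumes an unlimited supply of developers who can each pick up any task, which is generous.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: task name -> (duration in days, tasks it depends on)
tasks = {
    "schema": (3, []),
    "api": (5, ["schema"]),
    "ui": (4, ["api"]),
    "reports": (2, ["schema"]),
    "release": (1, ["ui", "reports"]),
}


def minimum_elapsed_days(plan):
    """Shortest possible elapsed time, however many developers are available."""
    graph = {name: set(deps) for name, (_, deps) in plan.items()}
    finish = {}
    # static_order() yields each task only after all of its prerequisites.
    for name in TopologicalSorter(graph).static_order():
        duration, deps = plan[name]
        start = max((finish[dep] for dep in deps), default=0)
        finish[name] = start + duration
    return max(finish.values())


total_effort = sum(duration for duration, _ in tasks.values())
print(f"Total effort: {total_effort} days")
print(f"Minimum elapsed time: {minimum_elapsed_days(tasks)} days")
```

In this made-up plan there are 15 days of effort, but the chain from schema through api and ui to release means it cannot finish in fewer than 13 elapsed days, no matter how many people are assigned.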

This problem of measurement is particularly acute for projects that are running late. The knee-jerk reaction is to add more developers to try and deliver more quickly. However, the delays caused by learning curves, communication overheads and dependency delays may serve to make a project even later if you add more developers.

Lack of courage

It's inevitable that at some point you will be asked to reduce or increase estimates to satisfy some ulterior motives. In general, I find that sales-orientated colleagues who are trying to maximise revenue tend to accuse me of over-estimating. On the other hand, project managers, who are generally trying to deliver projects safely, tend to accuse me of under-estimating. The truth is probably somewhere in between the two.

Most estimates are produced under some kind of commercial pressure. Reducing estimates without reducing scope should always be resisted as it's just a short-term fix that stores problems for the future. You cannot just pretend that something will take less time because somebody doesn't like the cost implications.

In my experience, estimates produced freely without undue cost pressure will tend to be most accurate. They will fully take into account all the work that is likely to be required, the pace of development that can be expected and the delays that are likely to occur. Re-estimating to meet an enforced target of some kind is just pretending – it may allow you to sell a project to stakeholders but it will not help you to plan it accurately.

Misplaced optimism

Developers are optimists by nature and most estimates are based upon the assumption that something will actually work once it has been coded.

This is not realistic as many first attempts at implementations are undermined by imperfect understanding. It's only when you attempt to code and execute something that you uncover the weaknesses in your understanding of the problem and underlying approach to a solution. This creates bugs and delays.

Most projects require a large number of development tasks that are chained together. It is inevitable that a number of these individual tasks will run into development problems somewhere down the line. It would be foolish to create a set of estimates that do not take this eventuality into account and create some kind of contingency.
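A little arithmetic shows why some contingency is unavoidable. The sketch below assumes each task independently has the same chance of hitting a problem, which is a simplification made for illustration; the 10% figure is invented, not measured.

```python
def chance_of_at_least_one_problem(task_count: int, problem_probability: float) -> float:
    """Probability that one or more tasks hit a problem, assuming independence."""
    if not 0.0 <= problem_probability <= 1.0:
        raise ValueError("problem_probability must be between 0 and 1")
    if task_count < 0:
        raise ValueError("task_count must be non-negative")
    # The chance that every task goes smoothly is (1 - p) ** n.
    return 1 - (1 - problem_probability) ** task_count


for count in (5, 20, 50):
    chance = chance_of_at_least_one_problem(count, 0.10)
    print(f"{count} tasks: {chance:.0%} chance of at least one problem")
```

Under these assumptions, a chain of twenty tasks that each carry a one-in-ten risk has roughly an 88% chance of at least one of them going wrong.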

Forgetting the testing

One of the more common myths about software development is that a system is more-or-less ready to go after the initial development cycle and that testing is just there to firm it up. In truth, it's only half-way there when a developer first writes it.

Boris Beizer estimated that his private bug rate was 1.5 bugs per line of executable code including typing errors. The majority of these bugs are found and corrected by a developer as the code is being written. Testing can be seen as the process of weeding out as many of the remaining bugs as possible.

This testing phase is often excluded from development estimates or just added onto the end almost as an afterthought or a nod to quality assurance. Testing is, in fact, an essential part of the development process. The testing phase is likely to be the first time when all the components in a system are brought together. It's also the first opportunity for code to be executed in conditions other than those created by the developer. Testing is the part of the cycle where software is made to actually work.

Testing is the process of finding problems with software - the more you do it, the better the product. A few days at the end of a long development will not cut the mustard. For a broad sense of scale your total testing effort should be at least 25% of the total development effort.
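As a quick sanity check on an estimate, the 25% rule of thumb can be applied directly. This sketch assumes the ratio is taken against development effort that does not already include testing; the function name and the 80-day figure are hypothetical.

```python
def minimum_testing_days(development_days: float, ratio: float = 0.25) -> float:
    """Smallest testing allowance implied by a testing-to-development ratio."""
    if development_days < 0 or ratio < 0:
        raise ValueError("development_days and ratio must be non-negative")
    return development_days * ratio


development_days = 80  # hypothetical development estimate
testing_days = minimum_testing_days(development_days)
print(f"Development: {development_days} days")
print(f"Testing (at least): {testing_days} days")
print(f"Combined: {development_days + testing_days} days")
```

For an 80-day development estimate this gives at least 20 days of testing, which is a good deal more than a few days tacked onto the end.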

Excessive assumption coverage

When you create a set of estimates you are often basing them on an imperfect understanding of the problem domain, particularly in a pre-sales scenario. This is where estimates can really start to feel like you are waving your finger in the air.

A common approach is to produce estimates that are backed up by a series of assumptions. This will give you “wriggle room” to adjust your estimates should these assumptions turn out to be invalid. The problem with this approach is that the assumptions can become too blatant an exercise in “arse-covering”, which renders your estimates completely invalid. I have seen estimates backed by a list of more than one hundred generalised assumptions and these do not help to communicate how long something is likely to take.

If you don't know enough about a problem domain to provide meaningful and detailed estimates then you shouldn't waste your time on them. You could provide a wide range or a set of different estimates based on potential scenarios. Better still, you could use the time to try and improve your understanding of the solution that you are trying to create an estimate for.
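A scenario-based estimate can be as simple as a short list of named ranges. The sketch below is one possible way of recording them; the scenario names and figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioEstimate:
    name: str
    low_days: int
    high_days: int


def overall_range(estimates):
    """Widest span covered by any of the scenarios, as (low, high)."""
    return (
        min(e.low_days for e in estimates),
        max(e.high_days for e in estimates),
    )


# Hypothetical scenarios for a project that is not yet well understood.
scenarios = [
    ScenarioEstimate("Extend the existing system", 20, 35),
    ScenarioEstimate("Integrate a third-party product", 30, 50),
    ScenarioEstimate("Build from scratch", 60, 90),
]

for scenario in scenarios:
    print(f"{scenario.name}: {scenario.low_days} to {scenario.high_days} days")

low, high = overall_range(scenarios)
print(f"Overall: {low} to {high} days")
```

Presenting the figures this way makes the uncertainty visible: the overall range of 20 to 90 days says more about how little is known than any single number could.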