When does refactoring become rewriting?

Developers tend to refer to any changes to existing code as "refactoring". This can include both small code improvements made as part of normal development and wholesale changes that replace a part of the application. These are very different activities that should be approached very differently.

Refactoring means something very specific. It is a controlled process for improving the design of existing code. It generally involves applying small changes to a code base without changing its external behaviour. The emphasis here is on making gradual progress via small steps that minimise the risk of breaking anything.
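To make "without changing external behaviour" concrete, here is a minimal sketch of one small refactoring step: extracting helper functions from a single block of logic. The invoice rules, constants and function names are all invented for illustration. The point is only that the old and new versions can be checked against each other on the same inputs.

```python
from decimal import Decimal

# Hypothetical business rules, invented purely for this illustration.
TAX_RATE = Decimal("0.20")
BULK_THRESHOLD = Decimal("100")
BULK_DISCOUNT = Decimal("0.10")


def invoice_total_before(lines):
    """Original version: every step is tangled together in one block."""
    total = Decimal("0")
    for quantity, unit_price in lines:
        total += quantity * unit_price
    if total > BULK_THRESHOLD:
        total -= total * BULK_DISCOUNT
    total += total * TAX_RATE
    return total.quantize(Decimal("0.01"))


def subtotal(lines):
    """Sum of quantity * unit price over all invoice lines."""
    return sum((quantity * unit_price for quantity, unit_price in lines), Decimal("0"))


def apply_bulk_discount(amount):
    """Take the discount off amounts above the bulk threshold."""
    if amount > BULK_THRESHOLD:
        return amount - amount * BULK_DISCOUNT
    return amount


def add_tax(amount):
    """Add tax at the flat (hypothetical) rate."""
    return amount + amount * TAX_RATE


def invoice_total_after(lines):
    """Refactored version: same inputs, same results, named steps."""
    amount = add_tax(apply_bulk_discount(subtotal(lines)))
    return amount.quantize(Decimal("0.01"))


def behaviour_is_unchanged(samples):
    """Compare old and new implementations on the same sample inputs."""
    return all(invoice_total_before(s) == invoice_total_after(s) for s in samples)
```

A check such as `behaviour_is_unchanged([[], [(3, Decimal("50.00"))]])` is the safety net that makes the step small: if it fails, you undo one extraction rather than hunting through a large change.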

The case for refactoring

The Extreme Programming (XP) call to "refactor mercilessly" argues that it is not cost-effective to leave hard-to-maintain old code in place. The implication is that refactoring should be an established part of the project lifecycle. You refactor continuously to keep the design simple and avoid unnecessary complexity.

This implies an ongoing process of small adjustments rather than occasionally taking a hatchet to the code base. Refactoring is not something you necessarily have to plan for separately from normal feature development. It is an approach to development that allows a good code design to emerge gradually rather than being defined up front or imposed in a "big bang" of change.

It can be easy to get carried away with refactoring and tip over into a large-scale rewrite. A rule of thumb is that changes should not spread beyond the initial subject of the refactoring. If you are altering dozens of files, introducing breaking changes to interfaces or getting bogged down in a war of attrition, then you have long since left refactoring behind.
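One cheap way to notice that a change is starting to break an interface is to compare the public signature before and after. The sketch below is only an illustration of that idea: the three `format_name` functions and the `same_interface` helper are hypothetical, and a matching signature is a necessary check rather than proof that behaviour is preserved.

```python
import inspect


def format_name(first, last, *, upper=False):
    """Hypothetical existing function that other modules call."""
    if upper:
        return (last + ", " + first).upper()
    else:
        return last + ", " + first


def format_name_refactored(first, last, *, upper=False):
    """Internals tidied up; callers do not need to change."""
    text = f"{last}, {first}"
    return text.upper() if upper else text


def format_name_rewritten(person, style="plain"):
    """Different parameters: every caller would have to be updated."""
    text = f"{person['last']}, {person['first']}"
    return text.upper() if style == "upper" else text


def same_interface(old, new):
    """True when both callables accept exactly the same parameters."""
    return inspect.signature(old) == inspect.signature(new)
```

Here `same_interface(format_name, format_name_refactored)` holds, while the comparison with `format_name_rewritten` does not. The second case is the point at which the change starts to ripple out to callers and stops being a refactoring.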

Why does this distinction matter?

Refactoring and rewriting are very different activities that demand very different approaches to planning and testing. Referring to any change as "refactoring" can be misleading, as it implies small, controlled changes that won't have any unpredictable side effects. The danger is that teams underestimate the effort required to deliver the changes and get stuck in a quagmire.

Where refactoring is a development discipline, rewriting is feature development that should be subject to the normal process of planning and testing. You need a very clear-eyed view of why you are getting involved in a rewrite, preferably with a tangible understanding of the benefits you will receive once the work is done.

These benefits can be difficult to articulate or put any specific numbers against. Developers are often reduced to justifying rewrites in terms of dark threats that a system will become unmanageable or suffer some kind of unspecified cataclysm in the future.

The reality is that they are often just trying to patch up ugly or unsatisfactory code. This is understandable as developers often spend as much time trying to understand code as they do writing it. That said, a developer's natural urge to rewrite code produced elsewhere should not be a sufficient justification on its own.

Developers should refrain from using "refactoring" as an appeal to an established and widely accepted agile practice to justify something completely different. It is a potentially misleading way of downplaying significant change. If you're having to estimate the effort on a bunch of changes then you shouldn't be referring to them as "refactoring". They are a rewrite.

Why refactoring is better than rewriting

The problem with rewriting code is that it rarely works. In most cases you just create a new mess. Some problems do get solved by a rewrite, but you inevitably find some new ones along the way as requirements change and your understanding of the system evolves.

It is difficult to win support for large-scale rewrites, often with good reason. They tend to lengthen the release cycle and risk upsetting users either through unnecessary change or lengthy feature stagnation. You risk destabilising the system and introducing major new bugs, while the rewrite process tends to be subject to scope creep as you get distracted by new features or technology.

In most cases, if you want to improve the quality of your code base, the best approach is to adopt a disciplined and consistent process of refactoring. It takes significant patience, but over time your code will steadily improve without incurring the risks associated with a rewrite.