11 August 2015

Most software architecture diagrams are useless

The best software architecture diagrams act as a map, clearly expressing a coherent solution to anybody who is unfamiliar with the terrain. Most diagrams fail this basic test and merely create contrived complexity that reflects the confused thinking of their author.

In many cases, the act of documenting a solution design benefits the architect more than any other stakeholder. Creating a diagram is an opportunity to organise your own thoughts, get your head around the solution and learn how to express it clearly.

The bottom line is that if you can’t express a system in a clear and concise diagram then you probably don’t understand it properly.

Targeting the audience, not a methodology

I rarely encounter projects where UML is used as a common currency to describe the system. Few people really understand it beyond development teams of a certain vintage. It’s a weighty and detailed methodology that doesn’t fit well with agile or test-driven approaches to development.

Diagrams only have value if people understand them. There’s no point pushing out a UML 2.0-compliant sequence diagram unless your audience know what the boxes, lines and arrows are supposed to represent.

If a diagram is to be understood then it needs to be clear and immediate. This is undermined if familiarity with any particular methodology is a prerequisite to understanding. Sometimes it’s best to tailor a diagram for the particular audience, which can even justify a little artistic licence if it helps you communicate more clearly.

Less is more

One of the most common mistakes is yielding to the urge to include everything that could possibly be significant. The diagram becomes a sea of boxes with intersecting lines that resemble a nightmarish traffic junction. The meaning of the diagram becomes diluted as it starts to resemble a “shopping list” of concerns that ought to be taken care of somewhere in the system.

George Miller’s “magical number seven” is relevant here. This widely cited paper suggests that the largest number of objects the average person can hold in working memory is seven, give or take two. Any more than that and it tends to take longer for people to make decisions based on the information provided.
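As a rough illustration of applying that rule of thumb, the sketch below flags diagrams that carry more elements than a reader is likely to hold in mind at once. The diagram names, their contents and the limit of nine (seven plus two) are assumptions made up for this example, not part of any particular tool.

```python
# Hypothetical threshold: Miller's seven plus two.
# Treat it as a prompt to reconsider the diagram, not a hard limit.
MAX_ELEMENTS = 9


def overloaded_diagrams(diagrams, limit=MAX_ELEMENTS):
    """Return {diagram name: element count} for diagrams exceeding `limit`."""
    return {
        name: len(elements)
        for name, elements in diagrams.items()
        if len(elements) > limit
    }


# Hypothetical diagrams, each described as a list of the boxes it contains.
diagrams = {
    "context": ["Customer", "Web shop", "Payment provider"],
    "everything": [f"Box {i}" for i in range(1, 15)],
}

print(overloaded_diagrams(diagrams))  # {'everything': 14}
```

A diagram that trips the check is a candidate for splitting into smaller chunks rather than something to be rejected outright.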

It is usually both impossible and unnecessary to describe a significant system in a single diagram. Some approaches seek to manage this complexity by providing different views of the same architecture, but it can be just as effective to split the system into smaller, more manageable chunks.

A diagram is an abstraction. An abstraction should remove unnecessary detail rather than create a separate representation of the system. After all, most diagrams are there to communicate something quite complex. It’s easier to do this by reducing the amount of detail on the page.

The model-code gap

Software architecture is often expressed as a set of diagrams that don’t reflect the reality of what’s happening in the code. A set of diagrams may describe a neat schematic with separate functional, logical and physical views that fail to describe the fudges and messy coupling that have actually been implemented.

George Fairbanks identified what he calls the “model-code gap” to describe a more fundamental disconnect between models and code. Architecture models contain a mixture of abstract concepts, technology choices and design decisions that cannot always be mapped into code. The end result can be source code that does not necessarily conform to the arrangement of components laid down by the model.

You do need the abstractions provided by architecture as they help to manage complexity and scale. You also need to explicitly manage the model-code gap by working with teams to ensure they understand the architecture and can implement it.

If there really is a mismatch between code and architecture, then you need to challenge the effectiveness of the architecture team. Are they just sitting in an ivory tower producing diagrams that bear no resemblance to reality?

Code is the only artefact that accurately describes the software architecture. The problem is that code is way too detailed to serve as a meaningful document. Diagrams generated automatically from code are universally awful and cannot express a design abstraction in any meaningful sense.

Architecture should establish a common language to describe a system. There should be a relationship between the abstractions and conventions laid down by architecture and those actually used in the code. Components should be named consistently and the main design decisions clearly understood by everybody involved in development. This implies that architecture diagrams only add value in an agile context at a relatively high level of abstraction.

A fair litmus test for the validity of any diagram is whether it is actually reflected in the code itself. If not, then the diagram is effectively useless.
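One way to make that litmus test concrete is to compare the dependencies a diagram claims with the ones the code actually has. The following is a minimal sketch, assuming the diagram can be written down as a mapping of component names to permitted dependencies and that each component is a Python module. The component names (`web`, `services`, `storage`) and the inlined source strings are hypothetical, chosen only to keep the example self-contained.

```python
import ast

# Hypothetical model: the dependencies the diagram says each component has.
DIAGRAM = {
    "web": {"services"},
    "services": {"storage"},
    "storage": set(),
}

# Hypothetical source for each component, inlined for the sake of the example.
# In a real project this would be read from the files on disk.
SOURCES = {
    "web": "import services\nimport storage\n",
    "services": "from storage import save\n",
    "storage": "import json\n",
}


def imported_components(source, known):
    """Return the names in `known` that the given source code imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in known:
                found.add(top_level)
    return found


def undocumented_dependencies(diagram, sources):
    """List (component, dependency) pairs found in code but absent from the diagram."""
    known = set(diagram)
    violations = []
    for component, source in sorted(sources.items()):
        actual = imported_components(source, known) - {component}
        allowed = diagram.get(component, set())
        for dependency in sorted(actual - allowed):
            violations.append((component, dependency))
    return violations


print(undocumented_dependencies(DIAGRAM, SOURCES))  # [('web', 'storage')]
```

Here the code shows `web` reaching directly into `storage`, a shortcut the diagram does not admit to. A check like this only covers import-level coupling, so it is a prompt for a conversation between architects and developers rather than proof that the two agree.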

Filed under Architecture, Development process, Rants.