Microservice preconditions: what needs to be in place before you decompose that monolith...

One of the main benefits of microservices is the promise to reduce the cost of change in a system. New services can be easily spun up to accommodate new features. Changes to existing service implementations also tend to be less risky than working them into long-established monoliths where they are more prone to unexpected side-effects.

The catch is that you need to make a significant up-front investment to realise this saving. There are some more obvious technical pre-requisites that enable the necessary development agility, such as deployment pipelines, basic monitoring and easy provisioning. Less obvious are the cultural changes, technical decisions, infrastructure facilities and governance processes that also need to be considered.

You don’t have to have every one of these issues settled before knocking out your first few services, but you should at least be aware of them. Otherwise, your microservices are more likely to be an expensive and potentially painful undertaking. This is particularly important if you are producing microservices across distributed development teams, which tends to exacerbate any cultural and organisational issues.


There are good posts by Martin Fowler and Phil Calçado that describe the kind of technical infrastructure that needs to be in place before you can start building microservices. It may be tempting to imagine that you can get away without comprehensive monitoring or deployment automation for a small number of services, but it’s surprising how easily you can be overwhelmed.

If you can’t monitor your environments, guarantee an efficient code production pipeline or provision environments easily then you really are better off focusing on your infrastructure. Microservices can wait.
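Basic monitoring starts with something as simple as each service exposing its own health. A minimal sketch, using only the Python standard library; the `/health` path, the port and the always-healthy report are illustrative assumptions, and a real service would probe its downstream dependencies:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def health_report() -> dict:
    # In a real service this would check downstream dependencies
    # (database, message broker) rather than always reporting healthy.
    return {"status": "ok"}

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps(health_report()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

# Starting the server blocks the process, so it belongs in the service's
# own entry point:
# HTTPServer(("", 8080), HealthHandler).serve_forever()
```

Even this much gives your monitoring something concrete to poll across every environment.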

“Rules of the road”

How are your services going to collaborate? There’s quite a debate to be had here. Synchronous, RPC-style approaches such as REST or gRPC are direct and easy to understand, but they can give rise to very real problems around latency and resilience by coupling services together in real time.

Ideally you want autonomous services that do not need to communicate with each other during their own request/response cycle. This allows services to keep working even when other parts of the system are failing.

This implies the kind of autonomous collaboration that can be achieved through event-based messaging. This adds significant complexity to the solution as you have to decide how to deal with issues such as duplicates, ordering and idempotency. It also tends to couple every service to a centralised messaging infrastructure. Note that issues of complexity and coupling also affect real-time communication, hence the rise of service mesh implementations to take care of issues such as service discovery, load balancing, rate limiting and failure.
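To make the duplicate-handling concern concrete, here is a minimal sketch of an idempotent event consumer. The event shape and the in-memory "processed" set are illustrative assumptions; a real service would persist the deduplication record in its own datastore so that redeliveries survive a restart:

```python
class OrderProjection:
    """Consumes order events and keeps a local view of order status."""

    def __init__(self):
        self.processed_ids = set()  # stand-in for a durable dedup store
        self.orders = {}

    def handle(self, event: dict) -> bool:
        """Apply an event exactly once; returns True if it was applied."""
        event_id = event["event_id"]
        if event_id in self.processed_ids:
            return False  # duplicate delivery: drop it silently
        self.orders[event["order_id"]] = event["status"]
        self.processed_ids.add(event_id)
        return True

projection = OrderProjection()
evt = {"event_id": "e-1", "order_id": "o-42", "status": "shipped"}
assert projection.handle(evt) is True    # first delivery is applied
assert projection.handle(evt) is False   # redelivery is deduplicated
```

The key design choice is that deduplication is the consumer's responsibility: the broker is only assumed to deliver at least once.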

This is one of those debates that can roll on unhelpfully, but you do need a clear strategy in place. In my own experience, it tends to be easier to start with event-based collaboration and ease into real-time interfaces when you are more confident about your service boundaries.

Avoiding chaotic contracts

Once you've established how your services are going to collaborate, you'll need to decide how to manage what is likely to become an evolving set of interfaces, or contracts, between them.

How do you stop a team from making changes that break one or more downstream services? How do you ensure that APIs are not just being designed for the benefit of the team that shouts the loudest? How do you manage the tendency to create brittle, bespoke integrations between services that are introduced to solve short-term problems?

There are technical solutions that can make the nuts and bolts of integration easier, such as adopting tolerant readers or consumer-driven contracts to reduce the scope for breaking changes. At the very least you will need some kind of process for publishing contracts and even a means of registering interactions between services. A format like Protocol Buffers can be helpful as it has a built-in contract, and formats like Swagger and API Blueprint can be used to provide consistent documentation for REST.
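A tolerant reader in miniature: the consumer extracts only the fields it actually uses and ignores everything else, so an upstream team can add fields without breaking it. The payload shape here is a hypothetical example:

```python
import json

def read_customer(payload: str) -> dict:
    """Extract only what this consumer depends on from a customer payload."""
    doc = json.loads(payload)
    return {
        "id": doc["id"],                       # required by this consumer
        "email": doc.get("email", "unknown"),  # optional: tolerate absence
    }

# The producer has added a new "loyalty_tier" field; because the reader
# ignores unknown fields, this consumer keeps working unchanged.
v2_payload = '{"id": "c-1", "email": "a@example.com", "loyalty_tier": "gold"}'
assert read_customer(v2_payload) == {"id": "c-1", "email": "a@example.com"}
```

The same discipline applies whatever the wire format: depend on the minimum, default the optional, and never fail on the unexpected.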

Test culture

Many of the benefits of microservices are predicated on having an efficient deployment pipeline in place. This requires more than having a Jenkins server running builds. Developers will have to respect a genuine culture of continuous integration where code is committed frequently, and broken builds are attended to immediately. They will also need a rock-solid test pyramid including unit tests that focus on behaviours and contract tests that verify interfaces.
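The contract tests mentioned above can be sketched as a consumer-driven check. This is a hand-rolled illustration rather than a tool such as Pact: the consuming team publishes the fields and types it depends on, and the provider's test suite replays that expectation against a real response before every deployment. The contract contents are hypothetical:

```python
# Expectation published by a (hypothetical) downstream billing team.
CONSUMER_CONTRACT = {
    "order_id": str,
    "total_pence": int,
}

def satisfies_contract(response: dict, contract: dict) -> bool:
    """Check that a provider response carries every field the consumer needs."""
    return all(
        field in response and isinstance(response[field], expected_type)
        for field, expected_type in contract.items()
    )

provider_response = {"order_id": "o-42", "total_pence": 1999, "currency": "GBP"}
assert satisfies_contract(provider_response, CONSUMER_CONTRACT)         # extra fields are fine
assert not satisfies_contract({"order_id": "o-42"}, CONSUMER_CONTRACT)  # missing field fails
```

Because the provider runs the consumer's expectations, a breaking change is caught in the provider's own pipeline rather than in a shared integration environment.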

Teams that are accustomed to working with monoliths tend to struggle with testing distributed applications. This often gives rise to a test culture that concentrates effort on large, shared integration environments. These environments are expensive to maintain, notoriously fragile and create a significant bottleneck for deployment.

Local vs shared capabilities

The relative isolation of microservices can allow teams a great deal of freedom in their technology choices. So long as everybody is agreed on the “rules of the road” then it shouldn’t matter what a microservice is built in, should it?

This requires some qualification as there can be a tension between team autonomy and the wider organisation’s desire to achieve economies of scale. Delivering larger-scale technologies such as database platforms, API gateways and messaging platforms tends to be more efficient at an organisation-wide scale where you can concentrate expertise and directly manage cost.

Knowing where to draw the line between local concerns and organisation-scale capabilities is one of the main evolving decisions of a service-based infrastructure.

Appropriate governance

Governance can be a bit of a dirty word in agile environments as it suggests ivory towers and architects who are out of touch with the realities of code. Agile rhetoric tends to promote a decentralised model where autonomous teams are responsible for design decisions.

This kind of autonomy requires a level of knowledge and technical skill that will be absent from many teams to begin with. This is particularly the case when it comes to the dark art of defining service boundaries. Teams will create duplicate services, some implementations will be so big that they start to resemble monoliths, while others may be so anaemic that they are little more than remote CRUD entities.

Architectural governance is still required in a microservice environment, but more as a supportive measure that defines a clear boundary within which teams are free to innovate. It’s important to be clear about the kinds of decisions that teams can make alone and those that need wider consultation. Over time, teams should become more battle-hardened, and governance can be relaxed.

The more teams that are involved in development, the bigger this problem gets.

Managing dependencies

You will have to consider the implications of longer-running processes that are shared across services. These can create dependencies between teams that need explicit management, no matter how loosely coupled your services are.

There will be support requests that do not fall neatly into a single team. Unexpected breaking changes will cause cascading errors after deployment. Teams will block each other's delivery schedules with conflicting priorities.

Tolerant readers and consumer-driven contracts can soften the technical impact here, but inevitably teams will have to get into the habit of collaborating and actively managing these dependencies.

Not only does this require regular shared planning, but it needs a culture of shared responsibility and co-operation. It is too easy for teams to retreat into silos, either throwing problems over a wall or, worse still, solving another team’s problems through service duplication.

Paying down the explanation tax

One of the hidden costs of moving to a microservice infrastructure is the time it takes for teams to get accustomed to new ways of looking at a system. They must become familiar with continuous delivery, understand many of the subtleties around service collaboration, and even acquaint themselves with new design techniques such as Domain Driven Design.

All this creates overhead, or “explanation tax”, that needs to be paid down before you can get any meaningful work done. Many architectural decisions are driven by pragmatic concerns around team experience and readiness. The more sophisticated the patterns and solutions you adopt, the more time you will have to spend explaining them and the more effort you will have to put into enforcing them.

You will need to build a widespread consensus around the exact flavour of microservice architecture that you have settled on. The more teams that are involved in the enterprise, the longer this kind of consensus takes to build. It can feel a little like herding cats.