Refactoring large monoliths to microservices: strategies, risks and pragmatic reality
5 May 2015
Large-scale rewrites of systems are loaded with risk. It may seem easy to start with: you just have to write a system that does the same things as the old one. The problem is that you inevitably get bogged down in unexpected complexity and deadline pressure.
A rewrite is a flawed strategy as you get drawn into reproducing legacy code in a different format. This tends to include those parts of the system that are redundant or poorly conceived. This style of “lift and shift” migration gives rise to a system that is in no better shape than the original legacy code.
An alternative approach is to gradually refactor areas of the system until the new code can fully replace the old. This can reduce risk as you can monitor progress and demonstrate incremental value. It also provides space to reconsider the design of the system and keep the legacy mistakes in the legacy code base.
Make sure you deliver new value. Quickly.
The desire for architectural change usually emerges from development teams rather than users or commercial stakeholders. Winning support can be difficult if the change does not deliver immediate, tangible benefits in the form of new features.
You can make arguments about investing in the longer term health of the system and lowering the cost of future change, but this is asking stakeholders to accept a “jam tomorrow” promise of uncertain future benefits. It’s a hard sell.
Ugly code, dated technology and inefficient design are not sufficient reasons in themselves for investing in refactoring. These problems are not necessarily visible to the user and addressing them is not guaranteed to deliver any new value to the business. In most cases you can keep a legacy system going indefinitely so long as you can find developers prepared to work on it.
Any refactoring effort has to be driven by clear user requirements. If you can solve long-running performance problems or deliver new features that are not possible in the legacy system then you have a much more attractive “jam today” proposition. Ideally you should seek to obsolete legacy functionality incrementally while delivering immediate and visible benefits to users.
Maintaining the momentum
Although a gradual approach allows you to manage risk, it can be hard to keep the momentum going for long enough to complete any transformation. Priorities change, people come and go and the commercial context evolves. The problems you face after a few years tend to be different from those you originally set out to solve.
This ongoing change can be manifested in an architecture that has clearly taken a number of twists and turns over the years. You can be left with an inconsistent structure that is littered with partially implemented patterns.
It is naïve to imagine that all development can cease on a legacy code base while you build a suite of new services. You may be in a hole with the legacy code but it won’t be practical to completely stop digging while you develop something new. There will be emergency bug fixes and feature requests that cannot be delivered quickly enough through services.
The important thing is to have a clear direction of travel and a pragmatic approach to getting there. Few large monolithic architectures can be fully refactored into a new architecture quickly. You will have to make your peace with the idea that some legacy code will linger on and take a pragmatic view of the transformation.
Prepare the ground
Although an incremental, agile approach to refactoring is recommended, some decisions have to be made up-front to reduce the friction involved in creating new services. When opportunities to refactor present themselves you want to be able to move quickly.
There are some broad design decisions to be made around how granular you want your services to be and how you are going to deal with issues such as distributed transactions. You will also need some level of shared infrastructure for these services to plug into.
If you want them to collaborate via messaging then you will need to select a transport and establish some “rules of the road” over formats. If you are exposing REST interfaces then you may want to consider an API Gateway or decide how your services will be discovered. Data persistence may also require a shared solution rather than expecting each service to take care of its own backup and disaster recovery.
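As an illustration of what “rules of the road” over formats might look like, the sketch below defines a shared message envelope that every service wraps its payload in. The field names (event_type, version, message_id, occurred_at) and the choice of JSON are assumptions made for the example rather than a recommendation of any particular standard.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Envelope:
    """Hypothetical envelope agreed between services; the payload stays service-specific."""
    event_type: str   # e.g. "order.created"
    version: int      # schema version of the payload
    payload: dict
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


REQUIRED_FIELDS = {"event_type", "version", "payload", "message_id", "occurred_at"}


def encode(envelope: Envelope) -> str:
    """Serialise an envelope to the agreed wire format (JSON in this sketch)."""
    return json.dumps(asdict(envelope), sort_keys=True)


def decode(raw: str) -> Envelope:
    """Parse a message, rejecting anything that breaks the agreed shape."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - data.keys()
    unexpected = data.keys() - REQUIRED_FIELDS
    if missing or unexpected:
        raise ValueError(
            f"bad envelope: missing={sorted(missing)} unexpected={sorted(unexpected)}"
        )
    return Envelope(**data)
```

Keeping the envelope small and versioned means the transport can be swapped later without every service having to renegotiate its payloads.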
Get to know the domain
When you are decomposing a large, legacy monolith it can be difficult to know where to start.
Legacy systems usually present a jumbled mess that has evolved over time. There won’t be any clear external interfaces and it may even be difficult to tell who or what is using the system. Any internal code organisation will long since have been compromised and “business logic”, so far as it exists, will be smeared across code modules, interfaces and database stored procedures.
If you can fit your domain on a single neat diagram then most of this article will not apply as you’re probably looking at something quite straightforward. Larger domains often defy coherent description and are rarely understood completely by a single person. The job of figuring out which services to develop can be pretty overwhelming.
Domain-driven design can help to manage this complexity by drawing out those areas that might be usefully implemented as services. You don’t need to develop a full domain model complete with a comprehensive list of entities, aggregates and value objects. A high-level view of the bounded contexts along with some headline data and behaviour is sufficient to provide a map for the way the system can evolve.
This map allows you to stay ahead of the game in terms of planning your refactoring in response to commercial requirements. You can be nimble and spot tactical opportunities for refactoring into services as you understand how they fit into a more coherent whole.
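A map like this does not need special tooling. The sketch below records each bounded context with its headline data and behaviour, then checks an incoming requirement against it to suggest where a refactoring opportunity might lie. The context names, the terms and the refactoring_candidates helper are all invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Sequence


@dataclass(frozen=True)
class BoundedContext:
    """One area of the high-level map: headline data and behaviour only."""
    name: str
    data: FrozenSet[str]
    behaviours: FrozenSet[str]
    extracted: bool = False  # True once a service has replaced the legacy code


# Hypothetical map for an imaginary retail monolith.
CONTEXT_MAP = [
    BoundedContext("ordering", frozenset({"order", "basket"}),
                   frozenset({"place order", "cancel order"})),
    BoundedContext("billing", frozenset({"invoice", "payment"}),
                   frozenset({"raise invoice", "take payment"})),
    BoundedContext("catalogue", frozenset({"product", "price"}),
                   frozenset({"publish product"}), extracted=True),
]


def refactoring_candidates(terms: Iterable[str],
                           context_map: Sequence[BoundedContext]) -> List[str]:
    """Name the contexts still in the monolith that a requirement touches."""
    wanted = {term.lower() for term in terms}
    return [
        context.name
        for context in context_map
        if not context.extracted and wanted & (context.data | context.behaviours)
    ]
```

For example, a requirement that mentions invoices and products would point at billing as a candidate, since the catalogue has already been extracted in this imaginary map.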
Practical strategies for integrating new services
Even if you can define your ideal service boundaries, integrating new services alongside an existing monolith is not a trivial job.
If you are lucky you may be able to extract the existing functionality and spin it out as a stand-alone service. Most of the time you will be looking to implement a new service and find some way of integrating it into existing processing.
When extracting new features it can be helpful to think in terms of transferring assets from one system to another. This involves some notion of migration as you may want to transfer assets in both directions. For example, if you move order management to a new service, you may still want to export orders back to the legacy application for reporting purposes.
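The order example can be sketched as a pair of translation functions that move an asset in either direction. The Order shape and the legacy column names (ORD_NO, CUST_NO, ORD_DATE, ORD_TOTAL) are assumptions invented here; a real legacy schema will have its own quirks.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    """Order as the new service sees it (hypothetical shape)."""
    order_id: str
    customer_id: str
    placed_on: date
    total: Decimal


def to_legacy_row(order: Order) -> dict:
    """Export an order as the flat, string-typed row an assumed legacy reporting table expects."""
    return {
        "ORD_NO": order.order_id,
        "CUST_NO": order.customer_id,
        "ORD_DATE": order.placed_on.strftime("%Y%m%d"),
        "ORD_TOTAL": f"{order.total:.2f}",
    }


def from_legacy_row(row: dict) -> Order:
    """Import an existing legacy row into the new service's model."""
    return Order(
        order_id=row["ORD_NO"],
        customer_id=row["CUST_NO"],
        placed_on=datetime.strptime(row["ORD_DATE"], "%Y%m%d").date(),
        total=Decimal(row["ORD_TOTAL"]),
    )
```

Keeping the translation in one place confines the legacy format to the edge of the new service, so the legacy mistakes stay in the legacy code base.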
You will also need to intercept the processes that act on these assets. If you can identify the events that trigger these processes then you can start to divert them to a new implementation. The events are unlikely to have an obvious implementation such as a message broker, but you will be able to identify somewhere to tap into them such as a service layer or even by monitoring changes in database tables.
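One way to picture this diversion is a small router placed at the tap point, such as a service layer. It sends each event to the legacy code by default and to a new implementation once one exists. This is a minimal sketch under assumed names: EventRouter, the event type strings and the handlers are hypothetical, and the handlers simply return a label so the routing is visible.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


class EventRouter:
    """Sits at the tap point and decides which implementation handles each event."""

    def __init__(self, legacy_handler: Handler) -> None:
        self._legacy = legacy_handler
        self._diverted: Dict[str, Handler] = {}

    def divert(self, event_type: str, new_handler: Handler) -> None:
        """Send this event type to a new service from now on."""
        self._diverted[event_type] = new_handler

    def restore(self, event_type: str) -> None:
        """Fall back to the legacy path, for example if the new service misbehaves."""
        self._diverted.pop(event_type, None)

    def dispatch(self, event_type: str, payload: dict) -> str:
        # Anything not explicitly diverted still goes to the monolith.
        handler = self._diverted.get(event_type, self._legacy)
        return handler(payload)


def legacy_process(payload: dict) -> str:
    """Stand-in for the existing monolith code path."""
    return "legacy"


def order_service_process(payload: dict) -> str:
    """Stand-in for a call to the new order service."""
    return "order-service"
```

Diverting one event type at a time keeps each step small and reversible, which suits the incremental approach described above.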
Above all – be pragmatic
In an agile environment any large-scale refactoring requires a degree of tactical nous. You need to be in a position to take advantage of emerging requirements to move gradually towards your target architecture. This requires some advance groundwork but also a willingness to be pragmatic and accept that the monolith is likely to be part of your plans for some time to come.