Splitting up shared databases

6 October 2019

The problem of shared databases is not confined to monolithic systems. A big store of shared state is the guilty secret lurking in many a microservice architecture.

The main problem with shared databases is that they create coupling by schema. The cost of change tends to go up as it becomes difficult to know when you can make changes safely without breaking something in another service.

This tends to have knock on effects for the scalability of the system. You cannot scale services individually as they rely on a data store that is used by other implementations. Given that the scalability of a service is generally defined by its slowest component, a shared data store often becomes a significant bottleneck at scale.

A shared database tends to limit the resilience of a system. The shared database becomes a single point of failure that brings down multiple services when it fails. It also tends to undermine automation and continuous delivery as you are unable to deploy services in isolation without the risk of breaking something.

Wrapping

Splitting databases is hard. Many databases are there for historical reasons, leaving you with a gordian knot of hundreds of tables and thousands of stored procedures. It can be difficult to know where to start.

If the underlying database is too complex to break apart then you can consider placing an abstraction between the database and consuming services. This forms a pragmatic wrapper that makes it easier to start changing the underlying database implementation.

Assuming your database supports the notion of a view then you could isolate different schemas within the same database. This tends to be hard to enforce unless you are prepared to manage separate credentials for each service. A thin service wrapper provides a more flexible abstraction, but this will tend to require more changes in downstream services to use APIs rather than service calls.

The problem with wrapping is that it is a lot of effort for what can only be a stepping-stone to something more enduring. It does nothing to address the problems of scale and resilience that make shared databases so difficult to manage.

Replication

You can alleviate some of the challenges of a shared database through data replication. Creating a read only copy of a database allows you to separate the operational data used to drive applications from the reporting data used in analysis workloads.

This can provide a solution to those databases that are just too complex to split. Instead of trying to isolate data by schema you can segregate by the type of processing.

This separation allows you to tune both the schema design and underlying technology for different types of processing. A read-only reporting database can be heavily denormalized with numerous indexes added to speed up data retrieval. You can store aggregated data or even manage multiple reporting databases for different reporting needs. You can even use a completely different database technology.

A reporting database is arguably inevitable in a microservice-based architecture. You cannot process reporting or analytics workloads across dozens of isolated data stores managed by autonomous services. You need some means of collating this data into a format that can be used for reporting.

The approach does require a mechanism for replicating data. There are numerous techniques you can apply, depending on the underlying database technology. You can replicate the entire database, use a batch export mechanism to extract data or use change capture to stream changes to downstream storage. Your solution may be defined by your users' tolerance for stale reporting data as every technique involves some delay in updates making it to the reporting view.

Replication can solve some of the more difficult problems of data contention that undermine performance at scale. It doesn't really address the key problem of coupling by schema. There is a risk that you can compound the problem by ending up with a series of replicated shared databases in your infrastructure.

Decomposition

Decomposition is the hardest approach of all as it inevitably requires that you tackle the complexity at the heart of the database.

Many decomposition efforts start by trying to pick off outlying parts of the schema. These tend to be the more rarely used areas of data that carry the least risk. There's some wisdom in doing this if you are trying to prove out new technology, but it does nothing to tackle the real problem.

At some point you will have to bite the bullet and seek to break the shared database up into smaller component areas. This will at least help to break the single, big problem into a series of smaller ones. Ideally you should be looking to align these component areas with your service boundaries or "bounded contexts".

It's likely that any attempt to split the database will be messy, inconclusive and involve a lot of entities that are common to every bounded context. That's fine – sharing data between services is a much easier problem to solve than sharing schemas.

You will need a mechanism for ensuring that this shared data is kept up to date. This will probably have to be faster than the kind of batch update mechanisms tolerated by reporting databases. Event-driven architecture is often used here as it provides a flexible means of broadcasting changes without coupling services together.

Whatever approach you choose you will have to become more tolerant of eventual consistency. This can be difficult for developers who are accustomed to having all their data in one place. You will also need to pay special attention to service boundaries, as running separate data stores for each service is a real test for the autonomy of your services. An architecture that has grown up around a great big shared database may struggle once this is removed.

Filed under Architecture, Design patterns, Domain Driven Design, Microservices and Databases.

Splitting up shared databases

Wrapping

Replication

Decomposition

Finding service boundaries: more than just the bounded context

Data meshes and microservices - the similarities are uncanny

How big is a microservice?

Implementing complex workflows in AI Agents

What should architects focus on - and what should be delegated to teams?

Versioning doesn't make it any easier to manage change in APIs