6 November 2012

Are CRUD methods what we really want from a repository?

The basic data manipulation pattern of create, read, update and delete (CRUD) is a popular foundation for data repositories. Like all patterns, it has a place, but developers should be wary of reaching for it by default. CRUD-based repositories may appear convenient, but they can make for poor service interfaces and give rise to a lot of unnecessary boilerplate code.

CRUD harks back to one of the more basic uses of business computing: replacing the stacks of filing cabinets that used to dominate every office. Electronic filing systems allowed us to replace paper-based forms with electronic equivalents in systems that took up much less space. The corresponding applications were geared around entering data, and their workflows merely modelled a paper trail.

For this kind of electronic record storage the paradigm of create, read, update and delete was all that was needed. It encapsulated pretty much all the business logic that was required at the time. You enter a form, run a process against it and move it on to the next stage.

The problem is that systems and processes have become more complex and they rarely fit the simple data recording model offered by CRUD. The focus of development has moved from storing electronic documents to automating business processes. Repositories based on generic data manipulation methods provide little scope for encapsulating this kind of workflow-based logic.

A simple example involves changing the state of an order to “shipped”. A number of processes may need to have completed before this can happen, from clearing the payment to picking the order in a warehouse. This implies a complex workflow and a set of business rules that must be satisfied before the order’s state can change. None of this can be encapsulated in a simple data update operation.
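To make the contrast concrete, here is a minimal sketch in Python. Every name in it (Order, CrudOrderRepository, OrderRepository, ship, payment_cleared, picked) is hypothetical and invented for illustration, and the two checks stand in for what would be a much richer workflow in a real system. The generic update happily marks an unpaid, unpicked order as shipped, while the task-based ship method is the one place where the rules live.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical order entity; the fields are assumptions for this sketch.
    order_id: int
    status: str = "pending"
    payment_cleared: bool = False
    picked: bool = False


class CrudOrderRepository:
    """Generic CRUD-style repository: any caller can set any field."""

    def __init__(self):
        self._orders = {}

    def create(self, order):
        self._orders[order.order_id] = order

    def read(self, order_id):
        return self._orders[order_id]

    def update(self, order_id, **fields):
        # No business rules here: the new values are simply written.
        order = self._orders[order_id]
        for name, value in fields.items():
            setattr(order, name, value)
        return order


class OrderNotReadyError(Exception):
    """Raised when an order cannot yet move to the 'shipped' state."""


class OrderRepository:
    """Task-based repository: exposes the operation, not the fields."""

    def __init__(self):
        self._orders = {}

    def add(self, order):
        self._orders[order.order_id] = order

    def ship(self, order_id):
        # The preconditions for shipping are checked in one place.
        order = self._orders[order_id]
        if not order.payment_cleared:
            raise OrderNotReadyError("payment has not cleared")
        if not order.picked:
            raise OrderNotReadyError("order has not been picked")
        order.status = "shipped"
        return order
```

With the CRUD version, calling update(1, status="shipped") succeeds whatever state the order is in. With the task-based version, ship raises OrderNotReadyError until payment has cleared and the order has been picked.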

CRUD as an anti-pattern

CRUD-based service interfaces can even be described as an anti-pattern that undermines loose coupling and the well-defined separation of concerns between components. This may be taking things a little too far, but they do tend to allow internal, private capabilities to bleed into a service’s public interface. The level of abstraction offered by CRUD is not appropriate for service interfaces and can encourage a “chatty” integration. It can feel more like working with a set of remote procedure calls rather than a service interface based on well-defined messages.

In this context, the rising popularity of RESTful services should not be seen as validation for CRUD-based interfaces. They are not the same thing. Where CRUD defines a set of primitive operations on a data repository, REST is a high-level API style that operates on more complex abstractions. CRUD manipulates basic data entities, while REST allows interaction with a working system.

Unnecessary scaffolding

Many repository designs add in CRUD operations out of habit. They are regarded as necessary “scaffolding” that will probably be used at some point. The assumption is that if you have a data table then you will need a generic set of methods to change the data in that table. While we’re at it, we might as well add some basic retrieval methods for sorting, searching and listing.

There are two problems with this approach. Firstly, this is speculative design that provides a set of generic methods that may not even be used. CRUD-based repositories can include a lot of unnecessary boilerplate and encourage some pretty lazy, generic approaches to repository design.

Secondly, this generic approach to managing entities does not encapsulate any meaningful business logic associated with data entities. CRUD-based repositories are regarded as “useful” in the sense that they are “building blocks” for system logic that is encapsulated elsewhere. Without proper discipline, this can encourage business logic to become fragmented as new systems or modules leverage the CRUD-based module to create conflicting rules.
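The fragmentation risk can be sketched in a few lines of Python. The store and both calling modules below are hypothetical, as is the cancellation rule itself. The point is only that each module built on the generic store brings its own version of the rule, and nothing in the store can reconcile them.

```python
class CrudStore:
    """Hypothetical generic store: rows keyed by id, no rules of its own."""

    def __init__(self):
        self._rows = {}

    def create(self, key, row):
        self._rows[key] = dict(row)

    def read(self, key):
        return dict(self._rows[key])

    def update(self, key, **fields):
        self._rows[key].update(fields)


def web_cancel(store, order_id):
    """Web shop module: refuses to cancel an order that has shipped."""
    if store.read(order_id)["status"] == "shipped":
        return False
    store.update(order_id, status="cancelled")
    return True


def support_cancel(store, order_id):
    """Support desk module, written separately: cancels regardless of status."""
    store.update(order_id, status="cancelled")
    return True


store = CrudStore()
store.create(1, {"status": "shipped"})
print(web_cancel(store, 1))      # False: the web rule blocks the change
print(support_cancel(store, 1))  # True: the support rule does not
print(store.read(1)["status"])   # cancelled
```

Both modules are “correct” by their own lights, yet the same shipped order can or cannot be cancelled depending on which one is asked. A single task-based cancel operation on the repository would leave only one rule to argue about.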

I am not proposing that CRUD should never be used. In some cases it may fit the bill perfectly. It’s just that CRUD can be a blunt instrument that encourages lazy interface design. This is particularly the case for public service interfaces, which should always be focused on the tasks that a user wants to accomplish rather than the generic facilities that a developer wants to provide.

Filed under Architecture, Design patterns.