2 June 2016

Is “Serverless” architecture just a finely-grained rebranding of PaaS?

“Serverless” platforms such as AWS Lambda and Azure Functions do not claim to eliminate the need for servers. They provide an extra layer of abstraction so that developers do not have to worry about them. In this sense they can appear similar to existing PaaS platforms such as Heroku.

The difference is in the level of granularity, as the basic unit of deployment becomes a single function. In this sense “functions as a service” might be a more appropriate description. You build a solution based on tiny services that can spin up almost automatically in response to demand. It’s a compelling idea that allows features to be scaled out automatically. The promise is no idle capacity and zero down-time.
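To make "a single function as the unit of deployment" concrete, here is a minimal sketch. It assumes a Python runtime and the `(event, context)` handler convention used by AWS Lambda; the `name` field in the event and the shape of the response are hypothetical and chosen only for illustration.

```python
import json


def handler(event, context):
    """Entry point that the platform would invoke once per request."""
    # 'name' is a hypothetical field on the incoming event.
    name = event.get("name", "world")
    # Nothing is remembered between invocations: the output depends
    # only on the event that was passed in.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```

The whole deployable artefact is this one function. Anything it needs to remember has to live somewhere else.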

It’s still a PaaS solution

This level of responsiveness may sound seductive but it does not overcome many of the shortcomings of PaaS provision.

The detail of any infrastructure might be abstracted away, but there’s still a bunch of servers, operating systems and run-times involved in execution. “Serverless” is really just a matter of perspective as somebody, somewhere has to look after the servers. In this model, developers are just delegating operational responsibility to a third party supplier.

This does represent the mother of all vendor lock-ins. You are coupling your entire solution to a third party and every function in every environment will run on their infrastructure. There’s no switching provider. Ever.

The environment also becomes a black box that you don’t have any access to. There’s no logging into machines to investigate problems or tweak environment configuration. This is the price of abstracting away the operating system and run-time – there can be occasions when you really need to see what is going on under the hood.

Paying for what you use can be a double-edged sword

Serverless platforms do offer the potential to manage cost more effectively by eliminating unused capacity. This does not necessarily translate into lower cost as there are several pain points involved in managing a process-based costing model.

If you pay for processing time there is inevitably a tipping point where it becomes more expensive than server provisioning, particularly for systems that think in terms of thousands of transactions per second. It can work out cheaper to provision a server with lots of memory and multiple cores rather than paying for billions of processes every month.
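The tipping point can be estimated with some simple arithmetic. The sketch below assumes a pricing model made up of a charge per GB-second of execution plus a charge per million requests. All of the function names and figures are placeholders rather than any vendor's actual prices, so substitute the numbers from your own provider's price list.

```python
def monthly_function_cost(invocations, avg_seconds, memory_gb,
                          price_per_gb_second, price_per_million_requests):
    """Estimated monthly bill for a function under per-use pricing."""
    compute = invocations * avg_seconds * memory_gb * price_per_gb_second
    requests = invocations / 1_000_000 * price_per_million_requests
    return compute + requests


def break_even_invocations(server_monthly_cost, avg_seconds, memory_gb,
                           price_per_gb_second, price_per_million_requests):
    """Monthly invocations at which per-use pricing matches a fixed server cost."""
    cost_per_invocation = (
        avg_seconds * memory_gb * price_per_gb_second
        + price_per_million_requests / 1_000_000
    )
    return server_monthly_cost / cost_per_invocation


# Placeholder figures for illustration only, not real vendor prices.
PRICING = dict(avg_seconds=0.1, memory_gb=0.5,
               price_per_gb_second=0.00002, price_per_million_requests=0.20)

if __name__ == "__main__":
    tipping_point = break_even_invocations(server_monthly_cost=500, **PRICING)
    print(f"Break-even at roughly {tipping_point:,.0f} invocations per month")
```

With these made-up figures the break-even sits in the hundreds of millions of invocations per month, which a system handling a few hundred transactions per second would pass comfortably.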

It’s easy to make costly mistakes with this kind of on-demand infrastructure. The discipline of having to tune processing to a finite set of resources can force a certain level of efficiency. In a serverless environment you can scale your processes so easily that you won’t necessarily notice how inefficient they are until the bill arrives.

Modelling the true cost of processes can be difficult, especially as you generally have to factor in related services that are invoked by running processes, such as message queues and data persistence. Most developers don’t really consider cloud pricing models when they are writing their code. It’s much easier to consider pricing in terms of large server-based units than more finely-grained and unpredictable processes.

Paying for what you use can be a double-edged sword as you are vulnerable to unexpected spikes in demand. Very few developers can predict how users will interact with the system until after it has gone live. Elasticity can be dangerous as many platforms do not directly support spending caps.
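Where the platform offers no spending cap, one option is to enforce a budget in the application itself. The following is a rough sketch of the idea rather than a production design: `BudgetGuard` is a hypothetical class, the per-invocation cost is an estimate you would have to supply, and in a real stateless deployment the running total would need to live in shared storage rather than in memory.

```python
class BudgetGuard:
    """Refuses further work once an estimated spending cap would be exceeded."""

    def __init__(self, monthly_cap, cost_per_invocation):
        self.monthly_cap = monthly_cap
        self.cost_per_invocation = cost_per_invocation
        self.spent = 0.0  # in-memory only; an assumption made for this sketch

    def allow(self):
        """Return True and record the spend if the next invocation fits the cap."""
        if self.spent + self.cost_per_invocation > self.monthly_cap:
            return False
        self.spent += self.cost_per_invocation
        return True


if __name__ == "__main__":
    guard = BudgetGuard(monthly_cap=1.00, cost_per_invocation=0.25)
    print([guard.allow() for _ in range(6)])
```

A guard like this trades availability for predictability: once the cap is hit, requests are rejected rather than billed.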

The distributed big ball of mud. As a service.

Building a platform where the basic unit of deployment is a single function can certainly provide flexibility. It can also give rise to fantastically complex, tightly-coupled systems.

With processing units this small it can be difficult to isolate the data and behaviour between each unit so they can be genuinely autonomous. Each function will tend to become coupled to numerous other functions in a complex dance of real-time dependency, otherwise known as the “distributed big ball of mud”.

This style of architecture is also difficult to control over time. Given that there is so little overhead to adding a new function, the temptation can be to address every new problem with a new function. Without careful governance you are left with numerous different implementations of the same feature and a lot of code that nobody remembers writing.

Serverless architecture also has remarkably little to say on the subject of data persistence. All those stateless functions need to store their data and state somewhere and this inevitably ends up in one massive shared database. This couples every function to the same data schema making it more difficult to make changes without unexpected consequences. The shared database will also create a single point of failure that defines the scaling limits of the system due to data contention.

In this sense stateless architectures can resemble the kind of systems built from database stored procedures in the 1990s and early 2000s. This has left a legacy of systems with badly-organised logic hosted in an inflexible runtime with a dependency on a single, enormous data store. The one difference is that you get more choice over where your processing happens, but the maintenance and scaling issues remain.

Is the world about to go serverless?

Despite what the marketing hype would have you believe, we are not in the midst of an evolution that started with containers and will finish with serverless platforms.

Serverless functions can be a good fit for stateless transformations that have demanding requirements around elastic scaling. One of the more convincing case studies I have seen is for an image upload system where stateless functions were used to process user input on the way to a file store. As ever with this kind of system, there lurks a huge great shared database of state on the fringes of the architecture.
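A stateless transformation of this kind might look something like the sketch below. This is not the system from the case study: the event fields, the allowed content types and the dictionary standing in for the file store are all assumptions made for illustration.

```python
import hashlib

# Hypothetical whitelist of accepted upload types.
ALLOWED_TYPES = {"image/png", "image/jpeg"}


def process_upload(event, store):
    """Validate an upload and write it to the store under a content-derived key."""
    content_type = event["content_type"]
    if content_type not in ALLOWED_TYPES:
        return {"ok": False, "reason": "unsupported type"}

    data = event["data"]
    # Deriving the key from the content keeps the function stateless:
    # the same bytes always map to the same key.
    digest = hashlib.sha256(data).hexdigest()
    key = f"uploads/{digest}"
    store[key] = data  # 'store' stands in for an external file store
    return {"ok": True, "key": key, "size": len(data)}
```

Everything the function needs arrives in the event and everything it produces is handed to the store, so any number of copies can run side by side.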

Functions are supposed to be stateless so they can spin up quickly in response to demand. This may rule out anything that relies on loading up significant amounts of reference data. Stateless architecture seems like a good fit for predictable and repetitive workloads that need to be scaled up to accommodate significant peaks.

Serverless could be seen as part of a wider trend that involves a cluster of technologies that allow you to compose applications out of autonomous services. The direction of travel certainly seems to be towards smaller, more self-contained services and greater automation. This does not have to involve abstracting away the infrastructure to a PaaS provider.
