Now that Kubernetes is on Azure, what is Service Fabric for?

Now that Kubernetes has gained traction in Azure, there is some confusion over what Service Fabric is really for. Is it an orchestrator for microservices? A means of lifting and shifting legacy applications into the cloud? An application development framework?

Large areas of the Azure infrastructure have been built out using Service Fabric, but that doesn’t necessarily make it an ideal candidate for building modern, cloud-native applications. Microsoft may have eaten its own dog food internally, but the growing popularity of Azure Kubernetes Service implies that Service Fabric may not be an ideal choice for running container-based applications.

One of the sticking points of Service Fabric has always been the difficulty of managing a cluster, made worse by patchy documentation and a tendency towards unhelpful error messages. Deploying a cluster with Azure Service Fabric spares you some of the operational pain around node management, but it doesn't change the experience of building applications using the underlying SDKs.

Building applications for Service Fabric

The main problem with targeting applications for Service Fabric is that they will lack portability. The Service Fabric SDK is incredibly opinionated. If you commit to Service Fabric, you will be tied into a specific SDK and application server for good. This is some way from the kind of "cloud-native", twelve-factor applications typically associated with container-based development.

Service Fabric isn’t directly comparable to container orchestrators such as Kubernetes, as it is more of an application server that supports a specific style of distributed system. Its model is built around applications, which serve as the boundary for upgrades: different application versions can run on the same cluster. The Service Fabric runtime manages the individual nodes in the cluster, providing capabilities to deploy, scale and monitor individual applications.

These applications can be composed of any number of services. The SDK encourages application code to be combined with deployment details in a single Visual Studio solution which is deployed to the cluster. The overhead of setting up a new solution tends to encourage larger applications made up of several services, rather than more numerous, isolated microservices.

Native Service Fabric services are based on very specific styles of implementation. Reliable services can be either stateless where state is managed externally, or stateful where state is managed by the Service Fabric runtime. Both types of service require base classes such as StatelessService to define their entry points, coupling them to the underlying SDK.

Service Fabric also offers reliable actors, a pattern where state and behaviour are combined into small, isolated units. The runtime looks after the life-cycle and persistence of these actors. The approach can be a good fit for scenarios that require large numbers (i.e. thousands or more) of these units, such as shopping carts or user sessions.
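As a rough illustration of the pattern rather than of the Service Fabric API itself, the Python sketch below shows one actor per shopping cart, with a runtime that activates actors on demand and saves their state when they are deactivated. The names CartActor and ActorRuntime, and the in-memory dictionary standing in for durable storage, are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CartActor:
    """One actor per cart: the state and the behaviour that changes it."""
    cart_id: str
    items: dict = field(default_factory=dict)

    def add_item(self, sku, quantity=1):
        self.items[sku] = self.items.get(sku, 0) + quantity

    def total_items(self):
        return sum(self.items.values())


class ActorRuntime:
    """Hypothetical stand-in for a runtime that owns actor life-cycle and persistence."""

    def __init__(self):
        self._active = {}  # actors currently held in memory
        self._store = {}   # stand-in for durable state storage (assumption)

    def get(self, cart_id):
        # Activate the actor on first use, restoring any saved state.
        if cart_id not in self._active:
            saved = self._store.get(cart_id, {})
            self._active[cart_id] = CartActor(cart_id, dict(saved))
        return self._active[cart_id]

    def deactivate(self, cart_id):
        # Persist state and drop the actor from memory.
        actor = self._active.pop(cart_id, None)
        if actor is not None:
            self._store[cart_id] = dict(actor.items)


runtime = ActorRuntime()
runtime.get("cart-42").add_item("book", 2)
runtime.deactivate("cart-42")           # state is saved, actor leaves memory
restored = runtime.get("cart-42")       # reactivated with its saved state
other = runtime.get("cart-7")           # a different cart is fully isolated
```

The calling code never creates or stores an actor directly. It only asks the runtime for one by identity, which is what makes the pattern convenient and also what ties the code to whichever runtime provides it.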

The catch is that reliable actors tie services irrevocably to a very specific application server. They can also be abused in much the same way as any shared database, often being used to maintain global state, provide a cache or even act as a queue. There are usually other ways to solve the problem that don’t involve such a strong platform lock-in.

Using Service Fabric as a container orchestrator

Service Fabric also supports two other types of application: guest executables and containers. Guest executables provide a mechanism for running legacy Windows applications in the context of a service orchestrator. Hey presto: instant microservices. Kind of.

Service Fabric's container support provides a lifeline for those who have made a commitment to Service Fabric yet are looking to get into container-based services. The catch is that containers in Service Fabric do feel like second-class citizens. The process of configuring and running containers in Service Fabric does not compare well with a “pure” container orchestrator like Kubernetes.

Resource governance is another problem in Service Fabric, making it much more difficult to manage "noisy neighbours" that starve other services of resources. You can assign CPU and memory to services, but this is an inflexible setting that reserves capacity. This makes it difficult to plan for unpredictable load and resource consumption across an entire cluster. Kubernetes provides a more flexible system of requests and limits so you can define a baseline of resource consumption while specifying what a service is allowed to grow to.
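To make the difference concrete, here is a minimal Python sketch. The container_resources helper is hypothetical, but the dictionary it returns mirrors the requests and limits fields of a Kubernetes container spec. The fits_on_node check is a deliberate simplification that reflects one real behaviour: Kubernetes schedules against requests, not limits. The node size and the numbers are made up for illustration.

```python
def container_resources(cpu_request_m, cpu_limit_m, mem_request_mi, mem_limit_mi):
    """Build a `resources` stanza in the shape used by a Kubernetes container spec.

    CPU is in millicores ("250m"), memory in mebibytes ("128Mi").
    """
    if cpu_request_m > cpu_limit_m or mem_request_mi > mem_limit_mi:
        raise ValueError("a request cannot exceed its limit")
    return {
        "requests": {"cpu": f"{cpu_request_m}m", "memory": f"{mem_request_mi}Mi"},
        "limits": {"cpu": f"{cpu_limit_m}m", "memory": f"{mem_limit_mi}Mi"},
    }


def fits_on_node(node_cpu_m, cpu_requests_m):
    """Simplified placement check: only the sum of requests counts."""
    return sum(cpu_requests_m) <= node_cpu_m


# Baseline of 250m CPU, allowed to grow to a full core.
resources = container_resources(250, 1000, 128, 512)

# Three services with a 250m baseline fit on a one-core (1000m) node,
# even though each may burst up to its 1000m limit.
burstable_fits = fits_on_node(1000, [250, 250, 250])

# Reserving each service's 1000m peak up front does not fit.
reserved_fits = fits_on_node(1000, [1000, 1000, 1000])
```

With a fixed reservation you have to size every service for its peak. With requests and limits you size for the baseline and cap the worst case, which is the flexibility the paragraph above describes.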

The Azure Kubernetes Service is evolving towards providing a PaaS-based implementation of Kubernetes, which is ideal if you want to orchestrate applications without operational overhead. You can also run a single application based on one or more container images using the Azure App Service. Container Instances can be used to run single instances on demand and can be a useful means of spinning up occasional jobs or providing burst capacity. Azure Batch is optimised more towards repetitive compute jobs.

Each of these services provides a more "container native" approach to development than Service Fabric.

A shrinking set of use cases

Service Fabric originally came about as part of Microsoft's internal evolution from monolithic, on-premises solutions to cloud-based microservices. It solves a bunch of problems around managing distributed applications in a cluster-style arrangement.

The problem is that it is geared around much larger use cases than most consumer scenarios. It runs super-scaled Azure services such as Azure SQL. Using it to run half a dozen VMs for a SaaS application feels like the proverbial sledgehammer for a walnut.

A few years ago, it was difficult to foresee how the Docker ecosystem was going to mature and that Kubernetes would become the de facto orchestration solution. If you’ve already made the investment in building applications for Service Fabric, there’s no compelling reason to abandon it immediately. However, there are limited reasons to adopt it now beyond managing a legacy estate of Windows applications.