22 July 2014

Using a tolerant reader for web service integrations in .Net

One of the more difficult aspects of collaborating services is managing change. No matter how carefully you define your services, some degree of evolution is inevitable as requirements change and your understanding of the system evolves.

In the .Net world developers using SOAP-based services are accustomed to generating data classes directly from schemas and using serialization to read service responses. This is convenient as developers can use services with a minimal amount of development, but it tends to be very intolerant of change. Any significant change to the service will invalidate the entire response, even if it only relates to data that the consumer is not interested in.

Ideally consumers should seek to be as tolerant as possible when reading service responses. This “tolerant reader” should only seek to access the data that an application is directly interested in rather than deserializing entities that will never be used.

What sort of scenarios can you be tolerant of?

No amount of tolerance will cater for removing or renaming a property that you depend on directly. Beyond this, it should be possible to keep a consumer integration going for pretty much any other change.

If you are only interested in one field then it should be the only field you seek to access. The idea is to reduce the scope for compatibility problems as you cannot be affected by changes to data that you do not need.

Version tolerant serialization in WCF

Earlier versions of the .Net framework were extremely strict about XML schema validation. Any difference from the published schema would lead to the entire payload being rejected. A more tolerant form of XML serialization has been in place since .Net 2.0, allowing new and missing data as well as providing fine-tuning mechanisms such as callbacks and binders.

This means that WCF contracts are version tolerant by default. You can add and remove fields and even change some data types without breaking existing client integrations. However, any scope for tolerance is often undermined by strict service definitions that make liberal use of the IsRequired property on the DataMember attribute.
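As a rough sketch of that default tolerance, the two hypothetical contracts below read the same payload, which carries an element neither of them knows about (Weight) and omits Price. The class names, the payload and the empty namespace are assumptions made for illustration, not part of any real service.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical consumer-side contract: members are optional by default.
[DataContract(Name = "Product", Namespace = "")]
public class TolerantProduct
{
    [DataMember] public string Name { get; set; }
    [DataMember] public decimal Price { get; set; }
}

// Same shape, but Price is explicitly marked as required.
[DataContract(Name = "Product", Namespace = "")]
public class StrictProduct
{
    [DataMember] public string Name { get; set; }
    [DataMember(IsRequired = true)] public decimal Price { get; set; }
}

public static class ContractDemo
{
    // Illustrative payload: contains an unknown element and no Price.
    public const string Payload =
        "<Product><Name>Widget</Name><Weight>3</Weight></Product>";

    public static T Read<T>(string xml)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return (T)serializer.ReadObject(reader);
        }
    }

    // True when the strict contract rejects the payload outright.
    public static bool StrictReadFails()
    {
        try { Read<StrictProduct>(Payload); return false; }
        catch (SerializationException) { return true; }
    }
}
```

Reading the payload as TolerantProduct should skip Weight and leave Price at its default value, while the StrictProduct read is expected to fail because the required member is absent.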

Although you can provide version tolerance with WCF, it doesn’t necessarily encourage you to. WCF is built on explicit contracts that are serialized in their entirety and it tends to be used for highly-structured, SOAP-based APIs. It seems a little counter-intuitive to expect significant version tolerance from a service that is tightly defined using a detailed format such as WSDL.

Fault tolerance and REST

REST and HTTP APIs provide more scope for version tolerance as they are not governed by a formal definition language such as WSDL. Clients can pick and choose the way they interact with the service a little more freely and services can publish an unlimited number of resources.

You can adopt a fault-tolerant approach to XML-based services in .Net by manipulating the payload with XPath statements:

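using System.IO;
using System.Xml.XPath;

// xml holds the raw response body
using (StringReader reader = new StringReader(xml))
{
    XPathDocument doc = new XPathDocument(reader);
    XPathNavigator nav = doc.CreateNavigator();
    // sum() is an XPath number, which Evaluate returns boxed as a double
    double total = (double)nav.Evaluate("sum(//product/@price)");
}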

LINQ to XML provides a slightly less cumbersome syntax for manipulating XML payloads:

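using System.Linq;
using System.Xml.Linq;
XElement data = XElement.Parse(xml);
// The cast yields null for a missing attribute, so those products count as zero
decimal total = (from p in data.Descendants("product")
                 select (decimal?)p.Attribute("price") ?? 0m).Sum();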

Working with JSON payloads via Json.Net provides a high degree of tolerance out of the box. For example, you can deserialize to a dynamic type:

var product = JsonConvert.DeserializeObject<dynamic>(input);
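To illustrate how the dynamic approach behaves when the payload changes, here is a minimal sketch that reads a single field. The `name` field and the `DynamicReader` helper are hypothetical, and the block assumes the Json.Net (Newtonsoft.Json) package is referenced.

```csharp
using Newtonsoft.Json;

public static class DynamicReader
{
    // Returns the product name, or null if the payload has no "name" member.
    public static string ReadName(string input)
    {
        dynamic product = JsonConvert.DeserializeObject<dynamic>(input);
        return (string)product.name;
    }
}
```

Extra members in the payload are simply never touched, and a missing member surfaces as null rather than as a deserialization failure, so the caller still has to decide what a null means.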

If you’d rather have the benefit of IntelliSense and compile-time checking then you can deserialize to a previously declared anonymous type:

var template = new { ProductIds = "" };
var product = JsonConvert.DeserializeAnonymousType(input, template);

Alternatively you can deserialize down to a data class that represents a shortened version of the entity:

var product = JsonConvert.DeserializeObject<ShortProductUpdated>(input);
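The shortened data class might look something like the sketch below. The properties shown are assumptions chosen for illustration, as is the `ShortReader` helper; the block relies on Json.Net's default behaviour of ignoring JSON members that have no matching property.

```csharp
using Newtonsoft.Json;

// Hypothetical cut-down view of a larger "product updated" message:
// only the members this consumer actually uses are declared.
public class ShortProductUpdated
{
    public string ProductId { get; set; }
    public decimal? Price { get; set; }   // nullable so a missing value is detectable
}

public static class ShortReader
{
    public static ShortProductUpdated Read(string input)
    {
        // Members in the payload with no matching property are skipped
        return JsonConvert.DeserializeObject<ShortProductUpdated>(input);
    }
}
```

Making Price nullable is a design choice: it lets the consumer distinguish a value the service stopped sending from a genuine zero.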

A weakened contract

The problem with these approaches is that you inevitably form a weaker contract between service and consumer. Without a schema to guarantee the exchange, a greater burden of validation falls on the consumer. Your code will be peppered with null checks and it will be vulnerable to data-related bugs that are hard to track down.
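One way to contain that burden is to push the checks into a small helper so that a missing value fails loudly at the point of reading. The sketch below is one possible shape for this using LINQ to XML; `PayloadGuard` and `RequiredDecimal` are hypothetical names.

```csharp
using System;
using System.Xml.Linq;

public static class PayloadGuard
{
    // Reads a decimal attribute that the consumer depends on, throwing a
    // descriptive FormatException when the attribute is absent.
    public static decimal RequiredDecimal(XElement element, string attributeName)
    {
        decimal? value = (decimal?)element.Attribute(attributeName);
        if (value == null)
        {
            throw new FormatException(
                "Element <" + element.Name + "> has no '" + attributeName + "' attribute");
        }
        return value.Value;
    }
}
```

This does not remove the validation work, but it keeps it in one place and makes the resulting errors easier to trace back to the payload.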

A more tolerant approach also fails to take account of the semantics involved in any contract changes. In ignoring changes to data it isn’t using, a tolerant reader may miss significant changes to the meaning of the data that it is using. Binding to a more formal contract can help to alert consumers to this kind of semantic change.

This is a trade-off between flexibility and certainty that hinges on how much your model is likely to change. The tolerant reader approach works best when you have to manage frequent change in the underlying model. When the entities are relatively static then the vulnerability created by tolerant readers might not be worth the extra flexibility.

Letting the consumer define the contract

Contracts assert a kind of “hidden” coupling between services and consumers as they become mutually dependent on a particular contract version. This encourages an extremely cautious approach to service evolution as a service cannot tell what the impact of a breaking change might be.

A strict approach to versioning does at least give you some certainty over when integrations are likely to break. Consumer integrations are not supported unless they use the correct contract version. This eliminates ambiguity: once you break this dependence on schema versions you can never be sure whether your tolerant readers are about to start failing.

One way to close this information gap could be to require consumers to declare the data they are interested in as part of a request. This would at least allow the API to detect when a consumer request is out of date. This notion of consumers declaring their data requirements is known as consumer-driven contracts, though they do not have any formal implementation in .Net.
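Since there is no standard mechanism for this, the following is only a sketch of the kind of check a service could run. It assumes a made-up convention in which each consumer sends the list of field names it relies on; `FieldDeclaration` and `FindUnsupported` are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FieldDeclaration
{
    // Returns the fields a consumer has declared that the service
    // no longer offers, i.e. the evidence that the consumer is out of date.
    public static IList<string> FindUnsupported(
        IEnumerable<string> declaredByConsumer, ISet<string> offeredByService)
    {
        return declaredByConsumer
            .Where(field => !offeredByService.Contains(field))
            .ToList();
    }
}
```

An empty result means the request is still compatible. A non-empty one gives the service something concrete to log or reject before the consumer starts failing silently.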

Despite the potential advantages of consumer-driven contracts, they can only really be applicable in a relatively closed community of well-known services. There is no established protocol for them and considerable overhead will be involved in agreeing and implementing the necessary conventions.

Another concern is whether consumer-driven contracts add further coupling between services. Services should encapsulate discrete and autonomous business functions and it’s important to maintain their conceptual integrity. It may not be appropriate for a consumer to attempt to drive the specification of a service.

A breaking change is a breaking change…

None of this provides any comprehensive answer to the problem of breaking changes. After all, if you are going to make significant changes to an API then you should expect some breakage. However, what you can do is limit what constitutes a breaking change and reduce the number of consumers who will be affected by it. A tolerant approach to consuming service payloads within the context of REST is the most straightforward way of achieving this.
