15 January 2013

Sharing code between geographically distributed development teams

Large-scale development increasingly involves distributed teams as organizations seek to manage costs and leverage resources on a global scale. This can give rise to development teams spread across continents and time zones, each working on a different aspect of the same code base. Once development scales in this way, problems of distance and communication can manifest themselves in a number of different ways.

There are practical problems such as managing the timing of nightly builds around non-stop global development or accessing source code over slow networks. There are also more intractable problems of control and communication. It can be difficult to enforce common development practices from a distance, particularly if you are trying to allow a degree of autonomy and encourage some creative freedom. It is also difficult to define clear roles and responsibilities when the people involved rarely get the opportunity to meet each other.
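To make the nightly build problem concrete, here is a minimal sketch that counts how many sites are at work during each UTC hour. The site names, the fixed UTC offsets (daylight saving is ignored) and the 9-to-6 working day are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical sites with fixed UTC offsets in hours (daylight saving ignored).
SITES: Dict[str, int] = {"London": 0, "Singapore": 8, "Seattle": -8}


def working_sites_per_utc_hour(sites: Dict[str, int],
                               start: int = 9, end: int = 18) -> List[int]:
    """Return, for each UTC hour 0..23, how many sites are inside working hours."""
    counts = []
    for utc_hour in range(24):
        active = 0
        for offset in sites.values():
            local_hour = (utc_hour + offset) % 24
            if start <= local_hour < end:
                active += 1
        counts.append(active)
    return counts


def quietest_hours(sites: Dict[str, int],
                   start: int = 9, end: int = 18) -> List[int]:
    """Return the UTC hours during which the fewest sites are working."""
    counts = working_sites_per_utc_hour(sites, start, end)
    fewest = min(counts)
    return [hour for hour, count in enumerate(counts) if count == fewest]
```

With these three example sites there is no UTC hour in which every team is offline, so the best a build schedule can do is avoid the hours when two teams overlap.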

These challenges can only be overcome through a combination of solutions. Code sharing should be driven by clear policies but it has to be enforced through effective automation. This should be augmented by distributed technical leadership, good solution design and above all by effective and direct communication.

Different approaches to code sharing

There are two approaches to sharing code between distributed teams: you can share the compiled binaries or share the entire source code base directly.

Sharing binaries may seem like the most straightforward approach and it can help to guarantee clear ownership by preventing any unauthorized changes to shared code. However, this approach does tend to undermine collective understanding of shared code as it reduces the visibility of component implementations.

If you are sharing binaries between teams there is an inevitable time lag between features being developed and made available to other teams. Requests for changes will take longer to code, build and release. Working with constantly changing binaries can also create difficulties with dependency management, though this can be mitigated to an extent with good release procedures or packaging technologies such as NuGet.
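One release procedure that can take some of the pain out of changing binaries is an agreed version numbering convention. The sketch below assumes teams publish "major.minor.patch" version numbers and only break compatibility when the major number changes. The function names are hypothetical and this does not describe how NuGet itself resolves versions.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(required: str, available: str) -> bool:
    """True if 'available' has the same major version and is not older.

    Assumes the publishing team only makes breaking changes when it
    increments the major version number.
    """
    req = parse_version(required)
    avail = parse_version(available)
    return avail[0] == req[0] and avail >= req
```

A consuming team could run a check like this in its build, so that an incompatible binary fails fast rather than surfacing as a runtime error.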

Sharing binaries is typical of a low trust environment where teams do not want other teams touching “their” code. Sharing the source code directly is a more “open” approach that implies greater trust between teams. Many of the benefits of open source software apply to distributed development within a company and direct access to source code can facilitate better understanding of code and easier debugging. Problems over rogue check-ins can be averted by clear planning or, failing that, managing check-in permissions in source code control.

Although direct source code sharing can facilitate more agile development on a global scale, this approach is not without its problems. It requires more bandwidth for a source control system as frequent updates are passed between teams. There can be practical difficulties with replicating a complex build process across different sites, particularly if there is a mixture of tools and configurations in play. Without careful management the visibility of source code may not be a good thing after all, particularly if partial updates leave it in an unstable state.

Optimizing source code management

It’s easy to underestimate how much remotely hosted source code repositories will affect development. Not only will they slow the pace of work, but remote teams might start to feel like second-class citizens if they do not have equal access to the shared code base.

This should be straightforward to address so long as you are able to distribute source code control. You don’t have to use a fully distributed system such as Git, but each location should have access to a local server that contains the code that they work on frequently. Centralized source control systems such as TFS support this through proxy servers that cache copies of version-controlled files close to the remote team.

Ensuring code quality through automation

Published standards are no guarantee of compliance or quality. With distributed teams you will need to automate the enforcement of some basic standards through static code checking tools such as FxCop and StyleCop. This is particularly the case for styling rules and naming conventions as a huge amount of time can be wasted on petty disputes over coding style.

Ideally, code quality compliance and unit test coverage should be verified before code can be checked into source control. This is referred to as a “gated check-in” in TFS and it helps to enforce a clear contract between developers, i.e. any code that is checked in will comply with agreed standards over style and content.
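The idea of a gate can be sketched in a few lines. This is not the TFS API, just an illustration of the principle: every check runs against a proposed change and the check-in is only accepted if none of them reports a failure. The Changeset type, the two example checks and the 80% threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Changeset:
    """A proposed check-in (hypothetical shape, for illustration only)."""
    files: Dict[str, str]  # path -> file content
    coverage: float        # fraction of lines covered by unit tests


# A check inspects a changeset and returns a list of failure messages.
Check = Callable[[Changeset], List[str]]


def no_tabs(change: Changeset) -> List[str]:
    """Example style rule: source files must not contain tab characters."""
    return [f"{path}: contains tab characters"
            for path, text in change.files.items() if "\t" in text]


def minimum_coverage(threshold: float) -> Check:
    """Build a check that fails when test coverage is below the threshold."""
    def check(change: Changeset) -> List[str]:
        if change.coverage < threshold:
            return [f"coverage {change.coverage:.0%} is below {threshold:.0%}"]
        return []
    return check


def gate(change: Changeset, checks: List[Check]) -> List[str]:
    """Run every check; an empty result means the check-in may proceed."""
    failures: List[str] = []
    for check in checks:
        failures.extend(check(change))
    return failures
```

Returning every failure at once, rather than stopping at the first, saves a remote developer from repeated round trips to a build server on another continent.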

Many developers find automated code checking a little draconian, but it does reduce the manual resource needed to manage shared code. It can help to provide a more accurate picture of the state of the source code and any non-compliance can be flagged for review rather than being left to fester in the code base.

Distributing technical leadership

With development spread throughout different time zones it’s vital that each location has some genuine technical leadership on the ground. Relying on a single location to provide all the guidance and high-level expertise is not going to be effective unless you can convince them to stay awake 24 hours a day.

A common mistake in setting up satellite development teams is getting carried away with the potential savings in labor costs and populating the office with relatively junior developers. Without experienced technical leaders productivity is likely to suffer and remote teams will struggle to produce commercially viable code. Trust issues will start to crop up between teams as the quality of code will be noticeably worse in an office of junior developers.

Designing for distribution and the role of contracts

The more internal dependencies a system has, the more difficult it becomes for multiple teams to get involved. A system based on loosely coupled components with clearly defined responsibilities is far easier for distributed teams to work with. Teams can take ownership of particular components in isolation and the chances of conflict are greatly reduced.

The interfaces or contracts which define the interactions between these components are absolutely vital and these will need clear ownership. A set of well-designed and consistently enforced contracts will help to define the rules of engagement between teams and express clear accountabilities.
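As a rough illustration of an enforced contract, the sketch below defines an interface that one team owns, an implementation another team might supply, and a shared check that any implementation has to pass. The PriceService example, its method name and the KeyError rule are invented for the purpose.

```python
from typing import Dict, Protocol


class PriceService(Protocol):
    """Contract owned by one team and consumed by others (hypothetical)."""

    def price_in_cents(self, sku: str) -> int:
        """Return a non-negative price; raise KeyError for an unknown SKU."""
        ...


class FixedPriceService:
    """One team's implementation of the contract, backed by a dictionary."""

    def __init__(self, prices: Dict[str, int]) -> None:
        self._prices = dict(prices)

    def price_in_cents(self, sku: str) -> int:
        if sku not in self._prices:
            raise KeyError(sku)
        return self._prices[sku]


def check_contract(service: PriceService, known_sku: str, unknown_sku: str) -> None:
    """Shared contract test that every implementation must pass."""
    price = service.price_in_cents(known_sku)
    assert isinstance(price, int) and price >= 0, "price must be a non-negative int"
    try:
        service.price_in_cents(unknown_sku)
    except KeyError:
        pass
    else:
        raise AssertionError("unknown SKU must raise KeyError")
```

Keeping the contract test alongside the interface means each team can run the same rules against its own implementation, without needing to see how the other side is built.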

The importance of face-to-face communication

The importance of a good communication strategy cannot be emphasized enough. Process and tools are absolutely no compensation for people sitting down together and developing a constructive working relationship.

Regular phone calls or expensive video conferencing systems are no substitute for face-to-face communication. Getting people to meet each other and work together, if only for a few days here and there, can do wonders for shared trust and mutual understanding.

Every time I have worked with distributed teams the working dynamic and productivity have been transformed by introducing more face-to-face communication. Video conferences just do not cut it. You need to fly a few key people over to break the ice. Let them spend a few days working together and even go out for dinner. If this involves shelling out for ten-hour flights and expensive hotels you will look back on the investment as a turning point in productivity that helped to shape a shared sense of purpose.

Filed under Development process, Strategy.