
How to master the art of horizontal scaling

As companies begin to grow and add new customers, they often think about attacking different business verticals. In reality, the best way to scale and earn the trust of users is to think horizontal.

Horizontal scaling is a method in which you break up your infrastructure into small pieces and strategically place them around the world in various data centres. This strategy is relatively new but it will be an essential part of the Internet economy.

Out with the old

In the past, companies would deploy their infrastructure into a single data centre. When their growth exceeded the space, they went looking for a larger data centre and would pick everything up and move to a new facility.

This procedure has obvious limits when it comes to physical constraints like power and cooling (eventually your business will reach a size where it is hard to find a building big enough).

But more importantly, it reflects an antiquated view of how the world works.

The commercialisation of IT has empowered the end user. Their expectations of services are high and businesses must deliver or risk losing out to a competitor, especially since competition, like users, is global.

In a global economy, customers are at every corner of the earth. For a business to thrive, it must deliver its content or services to customers as fast and reliably as possible. Since speed is essential, having your entire infrastructure flow through a single data centre is absurd for two obvious reasons: reliability and speed.

Putting all your eggs in a single basket is an unnecessary risk in a world full of natural disasters and cyber attacks. If your data centre goes down, your customers are cut off. Additionally, a single data centre means that by the time your services reach your end user, latency is high. This can have a huge impact: an Amazon report found that 100 milliseconds of latency can result in 1 percent of potential sales lost, and a Google report found that 500 milliseconds of latency dropped traffic by 20 percent.

Load Balancing

The solution is horizontal scaling. By strategically placing data centres around the world and using load balancing you will improve your customers' experience. The load balancer, which sits in front of your web server, would know the following information:

  • All the web servers in your infrastructure;
  • The health of all of your web servers based on automatic health checks;
  • Which web servers serve certain geographic regions;
  • The percentage of total traffic that each web server should receive;
  • For DNS-based load balancers, the TTL that determines how long the DNS record is cached.

From all of this information, the load balancer builds its load balancing pools for the various geographic regions, so it knows exactly which web servers are available for traffic and how often they should receive traffic.
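The pool-building logic described above can be sketched in a few lines of Python. The server names, regions, weights and health flags here are hypothetical placeholders, not a real balancer's data model:

```python
import random

# Hypothetical inventory: each entry records the region a web server
# serves, the share of that region's traffic it should receive, and
# its latest health-check result.
SERVERS = {
    "web-eu-1": {"region": "eu", "weight": 70, "healthy": True},
    "web-eu-2": {"region": "eu", "weight": 30, "healthy": True},
    "web-us-1": {"region": "us", "weight": 100, "healthy": True},
}

def build_pool(region):
    """Collect the healthy servers that serve a geographic region."""
    return {name: s for name, s in SERVERS.items()
            if s["region"] == region and s["healthy"]}

def pick_server(region):
    """Weighted random choice from the region's pool, mimicking how a
    balancer distributes a percentage of traffic to each server."""
    pool = build_pool(region)
    if not pool:
        raise RuntimeError(f"no healthy servers for region {region!r}")
    names = list(pool)
    weights = [pool[n]["weight"] for n in names]
    return random.choices(names, weights=weights, k=1)[0]

# When a health check fails, the server simply drops out of its pool.
SERVERS["web-eu-1"]["healthy"] = False
assert pick_server("eu") == "web-eu-2"
```

A real DNS-based balancer returns these answers as DNS records, with the TTL bounding how long a resolver keeps using a server after it drops out of the pool.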

Now if one of your data centres is knocked offline, you have much greater redundancy: the others are still up and running, and your customers' experience is not interrupted.

From the start

Building your infrastructure so that it is capable of horizontally scaling across physical data centres or cloud regions is much easier when done from the beginning. The idea of opening data centres around the world may seem daunting to a start-up, but it doesn't have to be.

Outsourcing your key infrastructure to specialists has never been easier or safer, leaving you more time to focus on your core competency. This ease, and the range of options for infrastructure, data collection, tracking and more, are why The Economist believes we're entering the "Golden Age of startups".


Of course, there are still challenges with horizontal scaling, the biggest being data replication and consistency. The level of consistency your business needs will shape the approach you take.

Think about your bank account as an example. The balance in your account should be exactly the same whether you're accessing it from a data centre in London or Amsterdam. But for the home address attached to your account, a little lag while a change replicates between data centres is acceptable.
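The bank-account example can be sketched as two replication styles. The `StrongReplica` and `EventualReplica` classes below are toy stand-ins for real replication machinery (actual systems use consensus protocols or asynchronous replication pipelines):

```python
class StrongReplica:
    """Balance-style data: a write is acknowledged only after every
    replica has applied it, so London and Amsterdam always agree."""
    def __init__(self, replicas):
        self.replicas = replicas          # dicts standing in for data centres
    def write(self, key, value):
        for r in self.replicas:           # synchronous fan-out to all replicas
            r[key] = value                # ack only once every write succeeds

class EventualReplica:
    """Address-style data: the local write returns immediately and the
    other data centres catch up later."""
    def __init__(self, replicas):
        self.replicas = replicas
        self.pending = []
    def write(self, local, key, value):
        local[key] = value                # fast local ack
        self.pending.append((key, value)) # queued for background replication
    def replicate(self):
        for key, value in self.pending:   # runs later, asynchronously
            for r in self.replicas:
                r[key] = value
        self.pending.clear()

london, amsterdam = {}, {}

balance = StrongReplica([london, amsterdam])
balance.write("balance", 100)
assert london["balance"] == amsterdam["balance"] == 100

profile = EventualReplica([london, amsterdam])
profile.write(london, "address", "new street")
assert "address" not in amsterdam         # not yet replicated
profile.replicate()
assert amsterdam["address"] == "new street"
```

The trade-off is visible in the write path: the strong version pays the cross-data-centre round trip on every write, while the eventual version accepts a window where the two sites disagree.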

There are different approaches to how you divide your data from one facility to the next.

The more you can segment your data, the better you can match each segment to the right tool in the toolbox. A refresher on the CAP theorem is helpful for this exercise: it states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

1. Consistency - all nodes see the same data at the same time

2. Availability - every request receives a response indicating whether it succeeded or failed

3. Partition tolerance - the system continues to operate despite arbitrary message loss or failure of part of the system.

For each segment of data in a horizontally scaled architecture, which of the three guarantees are you willing to give up? Once that decision is made, you can begin selecting tools (for data storage and processing) and designing the architecture (for how application data is accessed and moved through the system) to enable horizontal scaling. And even though these decisions can be hard, the rewards certainly outweigh the difficulty.
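One common way to act on that decision is quorum replication. The sketch below uses a made-up in-memory replica model to show the choice of consistency over availability: with N replicas, requiring W write acknowledgements and R read replies such that W + R > N means every read overlaps some up-to-date write, at the cost of refusing requests when too few replicas are reachable:

```python
N, W, R = 3, 2, 2     # W + R > N  =>  read and write sets must overlap

# Each replica is a toy record: a versioned value plus an up/down flag.
replicas = [{"version": 0, "value": None, "up": True} for _ in range(N)]

def write(value, version):
    """Apply the write to reachable replicas; fail without a write quorum."""
    acked = 0
    for rep in replicas:
        if rep["up"]:
            rep["value"], rep["version"] = value, version
            acked += 1
    if acked < W:
        raise RuntimeError("write rejected: quorum unavailable")
    return acked

def read():
    """Return the newest value among a read quorum; fail without one."""
    replies = [rep for rep in replicas if rep["up"]]
    if len(replies) < R:
        raise RuntimeError("read rejected: quorum unavailable")
    return max(replies, key=lambda rep: rep["version"])["value"]

write("v1", version=1)
replicas[0]["up"] = False      # one replica down: still within quorum
assert read() == "v1"
replicas[1]["up"] = False      # two down: refuse rather than answer stale
try:
    read()
except RuntimeError:
    pass
```

Giving up consistency instead (an AP-style choice) would mean answering from whatever replicas are reachable and reconciling versions later, as in the eventual-consistency example above.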

The future of scaling is happening now and it is modelled after the lean start-up movement. This scaling is lighter, smaller, and more nimble. Horizontal scaling is an excellent model to ensure that your services are always available, easy to service and resilient against a natural disaster.

Sometimes to take your business to the top, you have to go horizontal.

Cory von Wallenstein is chief technologist at Dyn.

Image: Flickr (EMSL; BobMical)