Public cloud performance: Who comes out on top?


Public cloud providers are everywhere. In fact, it’s not an exaggeration to say that the big three cloud providers, Amazon Web Services (AWS), Microsoft Azure and Google Cloud (GCP), are in a great race for cloud dominance.

Analyst research firms predict that the global public cloud market will continue to rise briskly at a compound annual growth rate (CAGR) of 22 per cent and will be predominantly influenced by the top three players. So much so that at the beginning of 2018, Forrester predicted the Big 3 would capture at least 76 per cent of the cloud platform revenue in 2018 and rise to 80 per cent by 2020. That’s big.

Yet what do enterprises really know about overall cloud performance?

Unfortunately, IT architects and leaders have had little measured performance data at their disposal with which to make decisions. To be clear, there is a lot of data out there comparing public cloud providers. In fact, if you conduct a search for “AWS vs. Azure vs. GCP,” you’ll get a lot of hits, but mostly you’ll find comparisons of market share, service catalogues, pricing, data centre presence and the like. When it comes to real performance data, however, it’s pretty slim pickings. And it’s not just that studies of public cloud performance are hard to find; when you do find one, the data itself is lacking.

As a result of this large performance data gap, IT leaders and architects have had to rely on instincts, educated guesses and vendor claims to formulate their cloud strategies and connectivity architectures. But that’s no way to run a business, especially today. That’s why we recently conducted research on cloud network performance and connectivity architecture for AWS, Azure and GCP to find out how the three actually compare. Three findings stand out.

First, using AWS is much more Internet-dependent than Azure and GCP, leading to 30 per cent less stable performance in Asia. Second, despite generally strong performance across the board, geographical anomalies persist and can make a real difference in user experience. Third, the big three have formed a symbiotic relationship between their backbone networks that supports multi-cloud networking with high performance and stability.

Enterprises formulating their cloud strategies have historically drawn on a limited set of external data to inform themselves. Most of the data available were based either on published cloud vendor service catalogues and pricing schedules or on surveys of IT professionals. Both sources of data are useful. However, metrics-driven performance data has been sparse.

What did exist mostly measured user performance to cloud regions one vendor at a time, typically with simple methodologies based on pings and traceroutes. These methods provide approximations that are useful for relative comparison, but in absolute terms they can be quite far from reality because of the way such simple measurement traffic is treated in the wilds of the Internet and cloud provider backbones.

Connecting to AWS means more Internet

One of the most interesting findings was that the AWS network design forces user traffic to traverse the public Internet for most of the journey between user locations and the target AWS region. By contrast, both Azure and GCP ingest user traffic much closer to the user location. In technical terms, the difference in network design is that Azure and GCP utilise what’s called BGP Anycast routing, where they essentially advertise the network addresses for all their regions at every juncture point between their network and the wider Internet. This means that user traffic is routed by the Internet to the nearest juncture point and carried from there through the cloud provider’s backbone network. AWS, on the other hand, advertises the network addresses for each cloud region only on a more localised geographical basis, near that region’s data centre locations. This causes the Internet to route user traffic away from the AWS backbone network until the traffic gets geographically close to that region.
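
To make the difference concrete, here is a minimal toy model in Python. It is only a sketch under loud assumptions: the network is flattened to a one-dimensional line, every position and distance is hypothetical, and traffic simply enters the provider backbone at the advertising edge closest to the user. Real BGP path selection depends on routing policy and AS paths rather than geography, and none of the numbers come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Distance a user-to-region path spends on each kind of network."""
    internet_km: float
    backbone_km: float

    @property
    def internet_share(self) -> float:
        total = self.internet_km + self.backbone_km
        return self.internet_km / total if total else 0.0


def route(user_km: float, region_km: float, edges_km: list[float]) -> Route:
    """Enter the provider backbone at the advertising edge nearest the user.

    All positions are points on a hypothetical one-dimensional line, in km.
    """
    ingress = min(edges_km, key=lambda edge: abs(edge - user_km))
    return Route(
        internet_km=abs(ingress - user_km),      # public Internet leg
        backbone_km=abs(region_km - ingress),    # provider backbone leg
    )


# Hypothetical layout: user at 0 km, target cloud region at 6000 km.
USER, REGION = 0.0, 6000.0
ALL_EDGES = [100.0, 2000.0, 4000.0, 5900.0]  # provider's juncture points

# Anycast-style: the region's addresses are advertised at every edge.
anycast = route(USER, REGION, ALL_EDGES)
# Region-local: the addresses are advertised only at the edge near the region.
regional = route(USER, REGION, [5900.0])

print(f"anycast:  {anycast.internet_share:.0%} of the path on the public Internet")
print(f"regional: {regional.internet_share:.0%} of the path on the public Internet")
```

In this toy layout the anycast-style route hands traffic to the backbone almost immediately, while the region-local route keeps it on the public Internet until it is close to the destination, which is the pattern described above.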

Where and why does this matter?

In North America and Europe, it doesn’t matter too much from a performance perspective. However, in places like Asia, where Internet fibre routes are sparser, and where Internet performance can vary much more, carrying traffic across the public Internet ends up creating a much higher degree of unpredictability in performance. In fact, in Asia, the standard deviation on AWS network performance is 30 per cent higher (or worse qualitatively) than that of Azure and GCP. If your enterprise is doing business or hosting in Asia, this is worth considering.
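
As a rough illustration of what a higher standard deviation means in practice, the Python sketch below compares two sets of latency samples. The sample values, the provider labels and the helper name `relative_spread_increase` are all made up for this example and are not data from the study; the point is only that two providers can have similar average latency while one is far less predictable.

```python
from statistics import mean, pstdev

# Hypothetical round-trip latency samples in milliseconds (not study data).
provider_a_ms = [80, 95, 70, 110, 85, 100]
provider_b_ms = [85, 92, 80, 98, 88, 94]


def relative_spread_increase(samples, baseline):
    """Fraction by which the standard deviation of `samples` exceeds `baseline`."""
    return pstdev(samples) / pstdev(baseline) - 1.0


for name, samples in [("A", provider_a_ms), ("B", provider_b_ms)]:
    print(f"provider {name}: mean={mean(samples):.1f} ms, "
          f"stdev={pstdev(samples):.1f} ms")

increase = relative_spread_increase(provider_a_ms, provider_b_ms)
print(f"A's standard deviation is {increase:.0%} higher than B's")
```

A result of 0.30 from this helper would correspond to the "30 per cent higher" figure quoted above; the invented samples here are deliberately more extreme so the difference is easy to see.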

Regional performance variations persist

Another key finding was that while public cloud performance is strong in general, there are some anomalies that are worth understanding. For example, GCP’s network is very strong overall. But it turns out that there isn’t yet a direct fibre route in the GCP network from Europe to the Indian subcontinent, so traffic from users in Europe going to Mumbai will take three times as long to get there with GCP as with Azure or AWS. Of course, the big three are constantly building out their networks, so these anomalies will go away over time, but until that happens it’s still helpful to have real data on which to base hosting decisions.

Multi-cloud is ready for prime time

Obviously, AWS, Azure and GCP are competing for market share and revenue and have largely similar offerings. So, it’s reasonable to wonder whether they play nicely together when considering a multi-cloud hosting strategy. It turns out that any such concerns are misplaced. We saw extensive connectivity between the backbone networks of the big three public cloud providers. This means that traffic going between regions of AWS, Azure and GCP almost never traverses the public Internet, and instead transits the backbones of the big three nearly exclusively. Inter-region performance measurements across all three providers show remarkable strength and stability. So, good news for your multi-cloud strategy: the big three networks are making it safe to proceed on a performance and availability basis.

Don’t stop the data

One of the things that this study underlined for us is that good data is the lifeblood of sound operations in the cloud. Even though the big three have strong networks, they are multi-tenant infrastructures and don’t come with a service level agreement (SLA). And even the big three can suffer outages that bring down many top brand websites for at least a portion of their customers, like the BGP hijack of the AWS Route 53 DNS service or the power surge that impacted Microsoft Azure’s Texas data centre. So, don’t stop the data.

Collect network intelligence continuously and you’ll always be in the know about your cloud architecture and operations.

Archana Kesavan, Senior Product Marketing Manager, ThousandEyes