Lessons from hyperscalers: clustered storage

(Image credit: Flickr / janneke staaks)

Hyperscale giants like Facebook, Google and Amazon Web Services (AWS) got where they are by building scalable, reliable and robust cloud infrastructures with low latency. What’s less well known is one of the most critical elements in building these infrastructures: cloud-native software-defined clustered block storage.

Simply put, cloud-native software-defined clustered block storage reduces storage latency, boosts performance, increases scalability and provides the reliability of a traditional storage area network (SAN) over existing data centre infrastructure. Typical clustered software-defined storage solutions replicate data across different storage servers, guaranteeing service and data availability even when one or more storage servers experience transient or permanent failures.

At cloud scale, everything fails, yet everything must also continue working. Clustering storage servers together is one way to achieve reliability out of unreliable components. 

Clustered storage using standard, commonly available servers and protocols helps hyperscalers, on-premises enterprises and cloud providers make the most of their data centre investment. That’s because there’s no need to rip and replace any component of the data centre infrastructure when building the cloud-native block storage stack. Clustered storage also improves application efficiency – whether in the cloud, on customer virtual machines or in containers.

Today, many providers hosting or building clouds are using storage solutions that are neither clustered, nor cloud-scale, nor software-defined. The prevalent storage architecture for clouds uses direct-attached hard disk drives and SSDs. In particular, many rely on NVMe-based SSDs that provide high performance and low latencies at reasonable cost. Any cloud-native block storage solution should provide the performance and latencies that applications are used to and rely on. Significantly reducing performance is not an option; neither is increasing latencies beyond the levels applications currently experience.

Flexibility and efficiency

While high performance and low latency are important, flexibility and efficiency of the infrastructure matter just as much. Current direct-attached storage deployments frequently suffer from low utilisation – with some or much of the infrastructure sitting idle – and from a lack of flexibility. Both drive operational expenses higher than they should be, because it is difficult to adapt to changing workload conditions and to increase either storage or compute capacity as needed. That’s because with direct-attached solutions, storage and compute must be scaled together.

Low efficiency and rigid infrastructures both contribute to a high total cost of ownership (TCO). A cloud-native block storage solution must follow the disaggregated storage model, where compute and storage are separated to provide the flexibility and efficiency cloud native block storage demands.

Cloud scale

The big cloud providers achieve their amazing scaling not through building one large system that grows indefinitely but rather by constructing larger systems out of many smaller systems that are loosely aggregated together. Similarly, any cloud-native storage solution performs best when it is constructed out of many smaller systems, loosely aggregated and managed as one. 


At cloud scale, everything fails. Hardware fails. Software fails. Everything fails. In the old days, SAN storage controllers were made reliable by using highly-available and reliable components and by carefully writing bug-free software, to the extent possible. In the cloud, reliability is achieved by accepting that everything will fail and constructing reliable systems out of unreliable components. Any cloud-native block storage solution should be reliable despite being built out of unreliable components. A cluster of SSD servers replicates data internally and keeps it fully consistent and available in the presence of failures. From the perspective of clients accessing the data, data replication is transparent, and server failover is seamless.
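The replication-and-failover behaviour described above can be sketched in a few lines. This is a minimal, illustrative model only – the `Replica` and `ClusteredVolume` names are hypothetical, not any vendor's API – showing how a client-side layer that writes every block to several servers makes a single server failure invisible to readers:

```python
# Illustrative sketch only: replicate each write across N storage servers,
# and fail reads over to any surviving replica transparently.

class ReplicaDown(Exception):
    pass

class Replica:
    """Stand-in for one storage server holding block data."""
    def __init__(self):
        self.blocks = {}
        self.alive = True

    def write(self, lba, data):
        if not self.alive:
            raise ReplicaDown
        self.blocks[lba] = data

    def read(self, lba):
        if not self.alive:
            raise ReplicaDown
        return self.blocks[lba]

class ClusteredVolume:
    """Replicates every write; reads fail over to the first live replica."""
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, lba, data):
        acks = 0
        for r in self.replicas:
            try:
                r.write(lba, data)
                acks += 1
            except ReplicaDown:
                pass
        if acks == 0:
            raise IOError("no replica accepted the write")

    def read(self, lba):
        for r in self.replicas:
            try:
                return r.read(lba)  # first live replica answers
            except ReplicaDown:
                continue
        raise IOError("all replicas down")

# One server failure is invisible to the client:
vol = ClusteredVolume([Replica(), Replica(), Replica()])
vol.write(0, b"hello")
vol.replicas[0].alive = False   # simulate a failed storage server
assert vol.read(0) == b"hello"  # read fails over seamlessly
```

A production system also has to re-replicate data from failed servers and keep replicas consistent under concurrent writes; the sketch only captures the client-visible transparency of failover.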

Clustered storage to the rescue

Implementing a clustering solution provides advantages for the data centre. A good clustering solution delivers a software-defined disaggregated storage solution for cloud data centres that can provide the same input/output operations per second (IOPS) as direct-attached NVMe SSDs while dramatically reducing tail latency. Additional benefits include the following:

  • Storage and compute scale independently, driving up infrastructure utilisation and efficiency and providing unparalleled flexibility.
  • High reliability despite being built out of unreliable components: standard servers, NICs, and NVMe SSDs (reliability is achieved through software).
  • A natural evolution of existing data centre infrastructure can be achieved using the NVMe/TCP block storage access protocol for application-server-to-storage-server as well as storage-server-to-storage-server communication. NVMe/TCP runs on any TCP/IP network and is included, or soon will be, in every major operating system and hypervisor.
  • No need to install any drivers beyond the standard NVMe/TCP drivers, leading to simple and easy deployment on existing application servers and with existing TCP/IP networks.
  • Multiple SSD clusters can exist in the same cloud data centre and can be easily aggregated and managed as one large cloud-scale block storage solution.
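The tail-latency claim above deserves a concrete framing: at scale, storage is judged by its high percentiles (p99, p99.9), not its average, because a handful of slow requests dominates what users see. A small sketch with synthetic numbers (the latency values are invented for illustration) shows how a few stragglers barely move the mean yet define the tail:

```python
# Tail latency is measured at high percentiles, not the mean.
# The latency samples below are synthetic, for illustration only.

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, int(round(p / 100.0 * len(s))) - 1))
    return s[k]

# 1,000 requests at ~100 microseconds, plus 10 stragglers at 5 ms:
latencies_us = [100] * 990 + [5000] * 10

print("mean  :", sum(latencies_us) / len(latencies_us))  # 149.0
print("p99   :", percentile(latencies_us, 99))           # 100
print("p99.9 :", percentile(latencies_us, 99.9))         # 5000
```

One per cent of slow requests leaves the mean almost untouched but puts the p99.9 fifty times higher – which is why a clustered solution is evaluated on whether it reduces that tail, not just on headline IOPS.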

Superior reliability and maintenance

An SSD clustering solution protects the storage cluster from failures unrelated to the SSDs themselves (such as CPU, memory and NIC failures, software failures, network failures, or rack power failures). It provides additional data protection through in-server erasure coding (EC), which shields servers from SSD failures. Clustering also enables non-disruptive maintenance routines that temporarily disable access to storage servers (such as top-of-rack (TOR) switch firmware upgrades). In these cases, enterprises or service providers continue working transparently with a subset of the storage servers in degraded mode until the maintenance routine completes and all storage servers once again become accessible.
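The in-server erasure coding mentioned above can be illustrated with the simplest possible scheme: a single XOR parity block across a stripe of data blocks (RAID-5 style). Real deployments use wider EC schemes tolerating more failures, but the reconstruction principle – rebuilding a lost block from the survivors – is the same. The function names here are illustrative, not any product's API:

```python
# Minimal erasure-coding sketch: one XOR parity block per stripe.
# Any single lost block (one failed SSD) can be rebuilt by XOR-ing
# the surviving blocks together.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def encode(data_blocks):
    """Return the data blocks plus one trailing parity block."""
    return data_blocks + [xor_blocks(data_blocks)]

def reconstruct(stripe, lost_index):
    """Rebuild the block at lost_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

stripe = encode([b"AAAA", b"BBBB", b"CCCC"])  # 3 data blocks + 1 parity
assert reconstruct(stripe, 1) == b"BBBB"      # one SSD lost, data rebuilt
```

With parity kept inside each server, an SSD failure is repaired locally without pulling replica data across the network – which is what lets the cluster reserve its replication budget for whole-server failures.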

When installed on standard servers in large scale data centres, a good clustering solution is optimised for I/O intensive compute clusters, such as Cassandra, MySQL, FDB, MongoDB, and time-series databases. With storage clustering solutions providing a highly available and durable storage layer, application teams can focus their efforts on developing new services while the solution guarantees the availability of and high-performance access to the data.

Clustering is also ideal for containerised environments such as Kubernetes, which require large-scale clusters with persistent, durable storage for rapid node migration, workload rebalancing, or recovery from failures without copying data over the network.


As smaller cloud providers and enterprises look to build their own cloud infrastructures and maintain their own data centres, clustered storage provides increased flexibility, efficiency and reliability while offering better latency. Building a cluster of up to 16 storage servers can help minimise the impact of data centre failures while allowing users to tolerate one or more server failures without data loss. For enterprises and smaller-tier cloud providers, now is the time to take a good long look at building their storage services on a clustered architecture.

Muli Ben-Yehuda, CTO and co-founder, Lightbits Labs

Muli Ben-Yehuda is the CTO and co-founder of Lightbits Labs, a startup developing infrastructure for hyperscale clouds. He was previously chief scientist at Stratoscale and researcher and master inventor at IBM Research.