Real-time storage tiering for real-world workloads: The ultimate guide

Application servers handle all operations that take place between users and a business’ backend applications. They are therefore mission critical to complex, transaction-based applications. But our data consumption has leapt from where it was several years ago.

Furthermore, many organisations hope to do much more with the data held on their servers; complex analytics for example. But there is a problem. The performance of many of today’s server-based applications is limited, not by a lack of technology, but by the relatively slow input and output (I/O) to disk storage.

A common way to mitigate this limitation and improve performance is to replicate active data from disks to higher-speed memory in a cache in servers and storage area networks (SANs).

This approach certainly can help, but it too has its limitations. Caching primarily accelerates reads, and the size of the cache is normally only a small fraction of the disk capacity available.

This makes it necessary to change the content constantly, heightening the risk that the algorithms employed to determine which content to cache may fail to deliver meaningful performance gains for critical applications.

Advances in technologies have created a new layer, or tier, of storage between cache memory and traditional hard disk drives: the solid-state drive (SSD). SSDs use high-speed flash memory and offer capacities on a par with those of hard disk drives (HDDs).

SSDs can be used either to produce a very large cache, which substantially improves “hit” rates, or to create a very fast and full tier of disk storage. Either way, the gain in performance can be significant for many applications.

The purpose of this article, intended for both business and technical decision-makers, is to describe the use of flash memory in SANs, in general, and how companies, such as Dot Hill Systems, have implemented SSDs in their next-generation SAN.

Storage tiering technology: in theory and in practice

Every application is limited to some extent and, as discussed above, sometimes significantly, by I/O to disk storage. Consider this: accessing data in the server’s memory typically takes just 10 nanoseconds, while accessing data on a hard disk drive takes about 10 milliseconds—a difference of 1,000,000 times or six orders of magnitude.

In between is solid-state storage using flash memory, with an I/O latency of 3 to 5 microseconds, making it 2000 to 3000 times faster than HDDs.
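The ratios quoted above follow directly from the latency figures. As a quick check, the short Python calculation below uses only the numbers given in the text (10 nanoseconds for memory, 3 to 5 microseconds for flash, 10 milliseconds for disk). The variable and function names are illustrative only.

```python
# Latency figures quoted in the text, expressed in seconds.
MEMORY_LATENCY = 10e-9            # 10 nanoseconds
SSD_LATENCY_RANGE = (3e-6, 5e-6)  # 3 to 5 microseconds
HDD_LATENCY = 10e-3               # 10 milliseconds


def times_faster(slow: float, fast: float) -> float:
    """How many times faster the 'fast' medium is than the 'slow' one."""
    return slow / fast


memory_vs_hdd = times_faster(HDD_LATENCY, MEMORY_LATENCY)
ssd_vs_hdd_best = times_faster(HDD_LATENCY, SSD_LATENCY_RANGE[0])
ssd_vs_hdd_worst = times_faster(HDD_LATENCY, SSD_LATENCY_RANGE[1])

print(f"Memory vs HDD: {memory_vs_hdd:,.0f}x")
print(f"SSD vs HDD: {ssd_vs_hdd_worst:,.0f}x to {ssd_vs_hdd_best:,.0f}x")
```

The 3-microsecond case works out to roughly 3,300 times, which the text rounds down to 3000.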

The impact of this large difference on performance varies with the application in question. I/O-intensive applications, especially those accessing information randomly in a database, experience the most adverse impact with HDDs. Read-intensive and computation-intensive applications, on the other hand, especially those that access information sequentially, are generally the least impacted.

Most other applications fall somewhere in between, with a mix of random and sequential inputs (reads) from and outputs (writes) to disk storage.

The traditional measure of performance for transaction-oriented workloads is the number of I/Os per second, commonly referred to as IOPS. Over time, certain rules of thumb have evolved to estimate the number of IOPS needed for any given application. An example would be the number of IOPS per mailbox in an email application.

While this approach provides a top-line figure, it completely misses the temporal dimension of performance: how the workload changes over time.

At any point in time, the data generating the most I/Os represents the hot data on the server. Consider the case in which the transactions handled by server applications are driven by human interaction. This could be when, for instance, a Web server is responding to what is trending on the Internet, or when an email server is processing messages that are being sent and received.

As these examples demonstrate, the workload is not uniform across the entire data set. Rather, it is focused on a small, constantly changing subset of the larger data set. A storage tiering system capable of delivering a significant improvement in IOPS must, therefore, be able to adapt to these changing workloads in real-time.

Storage tiering holds the potential to improve performance for all applications by locating hot data – that which is needed by the applications currently running on the servers – in the fastest tier, consisting of solid state drives. The next tier consists of traditional hard disk drives, which are normally fast-spinning to deliver the best possible performance for this medium.

Some solutions offer a third tier for archiving infrequently used data on lower-cost, lower-performance (but often higher capacity) HDDs. The better solutions also include a memory-based cache, enable the storage tiers to be pooled and virtualised, and provide data protection with support for various RAID (Redundant Array of Independent Disks) configurations.

The advantage of storage tiering is that it offers an effective balance of high performance (with the very fast SSDs) and low cost (with the relatively inexpensive HDDs). In the distant future when the cost of SSDs approaches the cost of HDDs on a per-byte basis, it may be possible to deploy a purely SSD-based SAN.

But that day is a long way away, which makes storage tiering the most cost-effective way of achieving significant performance gains today.

In theory, storage tiering always delivers excellent results, improving the performance of all applications. This is not the case in practice, however. The biggest challenge is identifying the hot data to move to the SSD tier, because hot data can be fleeting, changing by the hour or even by the minute for each application.

This challenge is exacerbated in a storage area network where every application being served has its own individual and fluid set of hot data.

In practice, storage tiering solutions suffer from two common and sometimes significant limitations:

1) The most crippling limitation is the frequency of the data migration between tiers. Many SAN solutions are capable of moving data to the SSD tier only once a day, owing to the disruptive nature of the migration itself. Because halting an application to migrate the data is normally unacceptable, the migration must occur during periods of minimal or no demand, which is usually overnight. Such a daily or nightly migration may be sufficient for applications where the datasets can be known in advance (and fit entirely in the SSD tier), but it is of little or no value for all other applications.

2) The second limitation involves how both the system itself and the data migration process are managed. Most systems require the IT department to determine which data sets will be migrated at what times between the tiers. This might be part of a regular schedule (e.g. a certain day of the week or month), or on an ad-hoc basis (e.g. an event or project scheduled for the following day). Either way, a sub-optimal determination inevitably results in a minimal improvement in performance.

Overcoming these limitations will extend the performance gains afforded by caching into the SSD storage tier as shown in the diagram below. Note how the practical size of a cache limits its ability to achieve gains beyond a certain point. The far larger capacities of SSDs (in the hundreds of gigabytes range), which are also available at a significantly lower cost per byte than cache memory, make it possible to scale the performance gains considerably. But this will only be the case if the migration of data between tiers can be made sufficiently intelligent and dynamic to keep pace with the real-time changes in hot data.

Caching technology inevitably reaches a point of diminishing return, and this is where SSDs, with their far higher capacities, can take over to extend the performance gains into either a substantially larger cache or a much faster storage tier.

Real-time storage tiering

The latest generation of storage tiering arrays can offer technology that integrates four separate capabilities which, together, extend performance across the storage tiers considerably.

Autonomic Real-time Tiering

Recent developments in technology have helped to overcome the two major limitations found in most tiered storage systems today by (1) automating the migration of data in (2) real-time. The system virtualises both the SSDs and HDDs at the sub-LUN level using 4MB pages distributed across multiple RAID sets. Intelligent algorithms then continuously monitor I/O access patterns and automatically move hot data to the SSDs to maximise I/O operations and, therefore, improve performance of the aggregate application workload. Real-time tiering technology utilises three separate processes, all of which operate autonomically and in real-time:

1) Scoring maintains a current page ranking on each and every I/O using an efficient process that, on decent equipment, adds less than one microsecond of overhead. The algorithm takes into account both the frequency and the last moment of access. For example, a page that has been accessed five times in the last 100 seconds would get a high score.

2) Scanning for all high-scoring pages occurs every five seconds, utilising less than 1 per cent of the system's CPU. Those pages with the highest scores then become candidates for promotion to the higher-performing SSD tier.

3) Sorting is the process that actually moves or migrates the pages: high-scoring pages from HDD to SSD; low-scoring pages from SSD back to HDD. Less than 80MB of data is moved during any five-second sort so that the migration has minimal impact on overall system performance.
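The three processes above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the actual scoring algorithm is not described in this article, so the sketch assumes an exponentially decayed access count with a hypothetical 100-second half-life. The page size, scan interval and move budget are taken from the text; every class, function and parameter name is invented for the example.

```python
from dataclasses import dataclass

PAGE_SIZE_MB = 4       # from the text: 4MB pages
SCAN_INTERVAL_S = 5    # from the text: scan (and sort) every five seconds
MOVE_BUDGET_MB = 80    # from the text: less than 80MB moved per sort
HALF_LIFE_S = 100.0    # ASSUMPTION: the decay rate is not given in the text


@dataclass
class Page:
    page_id: int
    tier: str = "HDD"
    score: float = 0.0
    last_access: float = 0.0

    def score_at(self, now: float) -> float:
        """Stored score decayed to 'now', so stale pages rank lower."""
        return self.score * 0.5 ** ((now - self.last_access) / HALF_LIFE_S)


def record_io(page: Page, now: float) -> None:
    """Scoring: refresh the page's rank on every I/O (frequency and recency)."""
    page.score = page.score_at(now) + 1.0
    page.last_access = now


def scan(pages: list[Page], ssd_slots: int, now: float):
    """Scanning: pick promotion candidates and the SSD pages they displace."""
    ranked = sorted(pages, key=lambda p: p.score_at(now), reverse=True)
    hot = [p for p in ranked[:ssd_slots] if p.score_at(now) > 0]
    hot_ids = {p.page_id for p in hot}
    promote = [p for p in hot if p.tier == "HDD"]                # hottest first
    demote = [p for p in reversed(ranked)
              if p.tier == "SSD" and p.page_id not in hot_ids]   # coldest first
    return promote, demote


def sort_pages(pages: list[Page], promote, demote, ssd_slots: int) -> int:
    """Sorting: migrate pages while staying under the per-cycle move budget."""
    budget = (MOVE_BUDGET_MB - 1) // PAGE_SIZE_MB   # 19 pages, i.e. under 80MB
    in_ssd = sum(1 for p in pages if p.tier == "SSD")
    demote = list(demote)
    moved = 0
    for page in promote:
        needs_room = in_ssd >= ssd_slots
        cost = 2 if needs_room else 1   # a displacement moves two pages
        if moved + cost > budget or (needs_room and not demote):
            break
        if needs_room:
            demote.pop(0).tier = "HDD"
            in_ssd -= 1
        page.tier = "SSD"
        in_ssd += 1
        moved += cost
    return moved


# One cycle; a real array would repeat scan and sort every SCAN_INTERVAL_S.
pages = [Page(i) for i in range(100)]
for t in (0, 20, 40, 60, 80):      # page 7: five accesses in 100 seconds
    record_io(pages[7], now=t)
record_io(pages[3], now=0)         # page 3: a single, older access
promote, demote = scan(pages, ssd_slots=1, now=100)
sort_pages(pages, promote, demote, ssd_slots=1)
print(pages[7].tier, pages[3].tier)
```

With a single SSD slot, the frequently and recently accessed page is promoted while the page touched once stays on HDD. Capping each sort at 19 pages keeps the data moved per cycle under the 80MB figure quoted above.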

Thin-provisioning

The increase in the volume and velocity of data can cause storage costs to exceed available budgets without some prudent provisioning. With a thin-provisioning capability, IT managers can dedicate available storage space to volumes only when actually needed and add storage capacity transparently to any application, also as required.

This approach enables organisations to minimise initial capital expenditures and scale-up capacity incrementally over time. The technology also enables LUN (volume) size to be configured independently of physical disk space and supports LUNs up to 128TB. With any thinly provisioned configuration, it is important to know when physical storage capacity is running low.

For this reason, an IT environment which enables IT managers to establish the alert thresholds needed to receive adequate warning is highly practical. These alerts are especially critical when capacity is becoming “overbooked” or configured larger than the available physical space.

Should physical space ever be at imminent risk of exhaustion, a real-time storage array switches to write-through mode to ensure that all data can be written to disk.
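The bookkeeping behind thin provisioning can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification: the class, its method names and the 80 per cent alert threshold are all hypothetical, and the switch to write-through mode is not modelled. It shows only the three ideas described above: space is allocated on first write, volumes may be "overbooked", and an alert fires when physical usage crosses a threshold.

```python
class ThinPool:
    """Toy thin-provisioned pool: physical pages are allocated on first write."""

    def __init__(self, physical_pages: int, alert_threshold: float = 0.8):
        self.physical_pages = physical_pages
        self.alert_threshold = alert_threshold  # ASSUMPTION: 80% is arbitrary
        self.volumes = {}        # volume name -> virtual size in pages
        self.allocated = set()   # (volume, page) pairs backed by real storage

    def create_volume(self, name: str, virtual_pages: int) -> None:
        """Creating a volume records its size but consumes no physical space."""
        self.volumes[name] = virtual_pages

    def write(self, name: str, page: int) -> None:
        """Back a virtual page with physical storage the first time it is written."""
        if not 0 <= page < self.volumes[name]:
            raise IndexError("page outside the volume")
        if (name, page) in self.allocated:
            return
        if len(self.allocated) >= self.physical_pages:
            raise RuntimeError("physical capacity exhausted")
        self.allocated.add((name, page))

    def used_fraction(self) -> float:
        """Share of physical pages currently backing written data."""
        return len(self.allocated) / self.physical_pages

    def is_overbooked(self) -> bool:
        """True when the volumes promise more space than physically exists."""
        return sum(self.volumes.values()) > self.physical_pages

    def needs_alert(self) -> bool:
        """True once physical usage reaches the configured threshold."""
        return self.used_fraction() >= self.alert_threshold


pool = ThinPool(physical_pages=10)
pool.create_volume("mail", 8)
pool.create_volume("db", 8)      # 16 virtual pages on 10 physical: overbooked
for page in range(8):
    pool.write("mail", page)
print(pool.is_overbooked(), pool.used_fraction(), pool.needs_alert())
```

In this example the two volumes together promise 16 pages against 10 physical pages, so the pool is overbooked from the start, and the alert triggers once the eighth page has been written.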

Rapid rebuild

When a disk drive fails in a RAID-protected storage volume, it exposes the application to data loss until the drive can be replaced. What is needed, then, is a rapid rebuild to minimise this exposure by accelerating the recovery time needed to fully rebuild a failed drive in a RAID set.

Because real-time tiered storage solutions spread LUNs across multiple RAID sets, rebuilding a single RAID set affects only a fraction of all disk I/Os. With less work involved during the rebuild, the disk(s) affected return to full operation rapidly, resulting in volumes becoming fully fault tolerant more quickly. Because the acceleration afforded by intelligent software is directly proportional to the amount of unused disk space, a trade-off may be required with the solution’s thin-provisioning capability.
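A back-of-the-envelope model makes the two relationships above concrete. The sketch below rests on two stated assumptions that are not taken from any product documentation: I/O is spread evenly across the RAID sets, and the rebuild skips unused space so that the work is proportional to the space actually in use. The function names are illustrative.

```python
def affected_io_fraction(raid_sets: int) -> float:
    """Share of I/Os that touch the rebuilding RAID set.

    ASSUMPTION: LUN pages, and therefore I/Os, are spread evenly.
    """
    if raid_sets < 1:
        raise ValueError("need at least one RAID set")
    return 1.0 / raid_sets


def rebuild_work_fraction(used_fraction: float) -> float:
    """Share of a failed drive that must be rebuilt.

    ASSUMPTION: unused space is skipped, so work scales with used space.
    """
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0 and 1")
    return used_fraction


for raid_sets in (1, 4, 8):
    print(f"{raid_sets} RAID set(s): "
          f"{affected_io_fraction(raid_sets):.1%} of I/Os affected")
for used in (0.25, 0.5, 0.9):
    print(f"{used:.0%} used: rebuild covers {rebuild_work_fraction(used):.0%} of the drive")
```

Under these assumptions the trade-off is visible directly: the fuller the physical disks, which is what thin provisioning tends to encourage, the less work the rebuild can skip.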

Disruptively simple user interface

Most tiered storage solutions place the burden on administrators to determine how to configure the controllers, including the creation of all storage pools and the specification of all low-level configuration details. A modern storage array helps to minimise these tasks by performing most of them automatically and makes the remaining ones as simple and intuitive as possible.

Auto-pooling technology automatically creates all storage pools, thereby eliminating the usual difficulties of determining RAID levels, chunk sizes, and which disks to use for which vdisks, for example. Ongoing management tasks are then all streamlined so that each can be performed quickly and efficiently with no need to create and maintain complicated policies.

This type of system is remarkably easy to navigate and use, and all of the information required to perform any task is always readily available. With its SSDs configured as an SSD Flash Cache rather than as a storage tier, it can also deliver a substantial improvement in performance for read-intensive applications with half the number of SSDs needed for storage tiering. The cache works by copying or replicating hot data to the SSDs from the RAID-protected HDDs, automatically and in real-time, just as with tiering. Data is then read from the SSD Flash Cache, and written to both the SSD Flash Cache and the HDD tier.
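The read and write paths described above amount to a write-through read cache, which can be sketched as follows. This is a simplified illustration rather than the product's design: it copies a block into the cache on every read miss instead of tracking which data is hot, and it assumes least-recently-used eviction, which the article does not specify. The class and attribute names are hypothetical.

```python
from collections import OrderedDict


class FlashReadCache:
    """Toy SSD read cache in front of an HDD tier (write-through)."""

    def __init__(self, hdd: dict, capacity: int):
        self.hdd = hdd              # stands in for the RAID-protected HDD tier
        self.capacity = capacity    # number of blocks the SSD cache can hold
        self.ssd = OrderedDict()    # oldest entry first (ASSUMPTION: LRU)
        self.hits = 0
        self.misses = 0

    def _store(self, key, value) -> None:
        """Place a block in the cache, evicting the least recently used."""
        self.ssd[key] = value
        self.ssd.move_to_end(key)
        while len(self.ssd) > self.capacity:
            self.ssd.popitem(last=False)

    def read(self, key):
        """Serve from SSD when possible; otherwise read HDD and keep a copy."""
        if key in self.ssd:
            self.hits += 1
            self.ssd.move_to_end(key)
            return self.ssd[key]
        self.misses += 1
        value = self.hdd[key]
        self._store(key, value)
        return value

    def write(self, key, value) -> None:
        """Write to both places, so the HDD tier always holds the data."""
        self.hdd[key] = value
        self._store(key, value)


hdd = {"a": 1, "b": 2, "c": 3}
cache = FlashReadCache(hdd, capacity=2)
cache.read("a")        # miss: copied from HDD into the cache
cache.read("a")        # hit: served from the cache
cache.write("b", 20)   # written to both the cache and the HDD tier
cache.read("b")        # hit
cache.read("c")        # miss: evicts "a", the least recently used block
print(cache.hits, cache.misses, list(cache.ssd))
```

Because every write also lands on the HDD tier, the cached copy is never the only copy of the data, so losing a cache device does not lose data.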

SSD tier performance

The chart below shows the potential performance gains achievable with tiered storage, which can typically deliver up to 100,000 random read and 32,000 random write I/Os per second. Naturally, the higher the “hit rate” in the SSD tier, the higher the gains.

But even a conservative hit rate of 70 per cent, which is easily attainable in most environments, can deliver a three-fold improvement in application performance. Far greater performance gains can be realised when the SSD tier handles 80 per cent or more of the I/O load, even though that tier represents only 5-10 per cent of the system capacity.
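The link between hit rate and overall gain can be approximated with a simple weighted-average latency model. The sketch below is a rough estimate under stated assumptions: it uses the 10-millisecond HDD latency and the slower 5-microsecond SSD figure quoted earlier in this article, treats every I/O as independent, and ignores queueing, caching and controller overhead. The function names are illustrative.

```python
HDD_LATENCY = 10e-3   # 10 milliseconds, as quoted earlier in the article
SSD_LATENCY = 5e-6    # 5 microseconds, the slower end of the quoted range


def effective_latency(hit_rate: float) -> float:
    """Average latency when 'hit_rate' of I/Os are served by the SSD tier."""
    return hit_rate * SSD_LATENCY + (1.0 - hit_rate) * HDD_LATENCY


def speedup(hit_rate: float) -> float:
    """Improvement over serving every I/O from HDD."""
    return HDD_LATENCY / effective_latency(hit_rate)


for hit_rate in (0.7, 0.8, 0.9):
    print(f"hit rate {hit_rate:.0%}: about {speedup(hit_rate):.1f}x faster")
```

In this model a 70 per cent hit rate gives a little over a three-fold improvement, in line with the figure above, and the gain climbs steeply beyond 80 per cent because the remaining HDD accesses dominate the average.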

With autonomic, real-time tiering technology, noticeable performance gains are experienced almost immediately, even with challenging applications. But because performance gains are dependent on the SSD tier hit rate, a good management system includes an application that enables IT managers to tune the performance.

The application constantly monitors IOPS, displays pertinent performance statistics and optionally exports these to Excel-compatible .csv files for reporting, archiving and trending analysis needs. Changing the mix of applications on servers and/or the distribution of data across the SAN can increase or decrease the hit rate.
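As an illustration of the kind of export described above, the following Python sketch writes per-interval I/O counters and the resulting SSD hit rate to CSV. The column names and the sample values are placeholders made up for the example; the actual export format of any given management application is not described in this article.

```python
import csv
import io

# Hypothetical column names; the real export format is not described here.
FIELDS = ["timestamp", "total_ios", "ssd_ios", "ssd_hit_rate"]


def hit_rate(ssd_ios: int, total_ios: int) -> float:
    """Fraction of I/Os served by the SSD tier (0.0 when idle)."""
    return ssd_ios / total_ios if total_ios else 0.0


def export_stats(samples, stream) -> None:
    """Write one CSV row per (timestamp, total_ios, ssd_ios) sample."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    for timestamp, total_ios, ssd_ios in samples:
        writer.writerow([timestamp, total_ios, ssd_ios,
                         f"{hit_rate(ssd_ios, total_ios):.2f}"])


# Placeholder samples, made up purely to exercise the code.
samples = [("09:00:00", 1000, 700), ("09:00:05", 1200, 960), ("09:00:10", 0, 0)]
buffer = io.StringIO()
export_stats(samples, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example self-contained; passing an open file object instead would produce a .csv file that a spreadsheet can open.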

A good management system includes an application that monitors and displays pertinent performance statistics to enable IT managers to track and optionally fine-tune the performance gains achievable with tiered storage.

Conclusion

Most applications today are I/O-constrained, which limits their performance when using traditional HDDs, whether directly attached or in virtualised storage area networks. Caching helps improve performance, but fails to scale because it quickly reaches the point of diminishing return on the investment.

SSD technology that uses fast flash memory increases I/O rates by 2000 to 3000 times compared with HDDs, and a combination of the two in a tiered storage configuration offers the most cost-effective way of achieving significant performance gains today. But most tiered storage systems deliver only marginal improvements in performance because they cannot accommodate dynamic workloads where the hot data is constantly changing.

Not only are businesses increasing their data usage, but they are also using that data in increasingly complex ways.

Indeed, good data analysis is often the point which underpins a company’s competitive advantage. Today’s technology overcomes the limitations of legacy tiered storage solutions to deliver a combination of high performance and high availability on a scalable platform with advanced features that are easy to manage with minimal IT resources.

Intelligent tiering algorithms operate continuously, automatically and in real-time to deliver the best possible improvement in performance for today’s rapidly-changing real-world workloads. And the low initial capital expenditure, combined with streamlined management, results in a surprisingly low total cost of ownership, while the performance gains yield a high return on investment.

Warren Reid is marketing director for EMEA at Dot Hill