Thirty cheers for RAID


It is quite common to make a fuss about a 30th birthday. We mere mortals see it as a milestone and the time to begin making our mark on the world. But what about a piece of technology? By the time it reaches such an auspicious age it is probably redundant, gathering dust in the corner, bigger and slower than the latest model. There are always exceptions, however…

Thirty years ago, Randy Katz, David Patterson and Garth Gibson published a paper that changed the way we manage data storage forever. Their 1987 paper coined the acronym ‘RAID’ and described it for the first time. That same technology, although constantly evolving, is still in use today.

These three computer scientists, now quite rightly recognised as pioneers of computing, would be the first to admit that their innovation was built on some great existing work in the field of data storage. Mirroring, which we now know as RAID 1, was well established over a decade earlier, and in 1977 Norman Ouchi at IBM filed a patent outlining what we now call RAID 4.

In many ways, what was most revolutionary was the simplicity of the framework that RAID provided: a single structure that encompassed all network and server-based storage. Quite simply, it brought a new elegance to data storage.

Why was data storage so important?

At that time society was hurtling into the information age. Even back in the late 1980s, before Steve Jobs introduced the world to the iPhone and when Microsoft had only just launched its first consumer operating system, data loss and availability were a big deal. Crucially, our three pioneers of the computer world saw that taking data storage seriously was going to be fundamental as we all became more reliant on data.

What is RAID?

RAID today is commonly interpreted as Redundant Array of Independent Disks. Originally the ‘I’ stood for ‘inexpensive’, but that was at a time when storage devices were significantly more expensive than they are now.

At their core, RAID solutions are data storage systems that spread or replicate data across multiple drives. This technology revolutionised enterprise data storage and was designed with two key goals in mind: to increase reliability and increase performance.

Drives within the system are configured so that data can be divided or replicated over two or more drives for load distribution or to help recover data if a drive fails.

RAID combines physical disks into a single logical unit by using dedicated hardware or software. Hardware-based systems manage the RAID independently from the host computer using a RAID controller, so the operating system is unaware of the technical workings of the RAID and sees the whole storage system as if it were a single volume connected to the host computer.

Software solutions are usually implemented within the operating system or ‘OS’, and the host system manages the workings of the RAID. Whilst RAID is traditionally used on servers, it can also be used on workstations.

Let’s take a look at some of the basic technical terms that are often used when describing how RAID storage works:

Three key principles

1. Parity is redundant information, typically calculated with an exclusive-OR (XOR) of the data held on the other drives, which is stored in the RAID system and allows data to be restored in the case of a drive failure.

2. Redundancy is the duplication of critical components in the system architecture to increase reliability and act as a fail-safe. In essence, it allows one or more components to fail before the whole system fails; in the case of RAID systems, the components are the drives.

3. Mirroring is when the same data is duplicated from one disk to another. Striping is another method, in which data is split into chunks that are written across multiple disks. Different RAID setups use one or more of these techniques, depending on system requirements.
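To make these terms concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular RAID controller works: the helper names `stripe` and `xor_blocks`, the three-disk layout and the four-byte chunk size are all assumptions made for the example, and XOR is used as the parity calculation because that is the usual choice for single parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe(data, n_disks, chunk_size):
    """Split data into chunks and deal them round-robin across n_disks (hypothetical helper)."""
    disks = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), chunk_size):
        # Pad the final chunk with zero bytes so every chunk has the same length.
        chunk = data[offset:offset + chunk_size].ljust(chunk_size, b"\x00")
        disks[(offset // chunk_size) % n_disks].append(chunk)
    return disks


data = b"ABCDEFGHIJKL"

# Striping: each of the three data disks receives one 4-byte chunk.
disks = stripe(data, n_disks=3, chunk_size=4)

# Parity: XOR of the chunks that sit in the same row on each data disk.
row = [disk[0] for disk in disks]
parity = xor_blocks(row)

# Simulate losing the second disk, then rebuild its chunk by XORing
# the surviving chunks with the parity.
rebuilt = xor_blocks([row[0], row[2], parity])

# Mirroring: the second copy is simply identical to the first.
mirror_a, mirror_b = bytes(data), bytes(data)

print(rebuilt)  # the chunk that was on the lost disk
```

The rebuild works because XOR is its own inverse: combining the parity with every surviving chunk cancels them out and leaves only the missing one.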

Be on the level

Standard RAID configurations are referred to as levels. There were five levels originally outlined in 1987, but more have evolved over time. Now there are several nested levels and many non-standard levels which are mostly proprietary. Today we have RAID levels ranging from RAID 0 all the way to RAID 61 and beyond, with larger companies creating bespoke RAID levels to support different applications and infrastructure requirements.

Different RAID levels have different types of redundancy and a trade-off usually has to be made between fault tolerance and performance, depending on the application. As an example, here are some of the most basic levels.

· RAID 0 uses ‘striping’ and is the most basic RAID level. It offers no redundancy, but it does increase performance. Data is striped across at least two disks, and with every disk added, read/write performance and storage capacity increase. If one drive fails, the RAID controller has no way of rebuilding its contents, so the data on the array is lost.

· RAID 1 uses ‘mirroring’ which, as the name suggests, writes the same data to two disks; it is the simplest form of RAID redundancy. RAID 1 can double read performance over a single drive, but it gives no increase in write speed. This level allows for one drive to fail.

· RAID 5 is a common configuration that gives a decent compromise between reliability and performance. It provides a gain in read speeds but no increase in write performance. RAID 5 introduces ‘parity’, which takes up the space of one disk in total. This level can handle one disk failure. A hot spare (for example, a fifth drive alongside a four-drive array) can sit idle in the system with no data saved to it. If one disk fails, its data can be rebuilt onto the hot spare using the data and parity on the other drives. Once the rebuild has finished, you can remove the failed drive and replace it with a new one, which becomes the new hot spare.

· RAID 6 takes the concept of RAID 5 and adds further redundancy with dual parity. This allows data to be recreated even if two disks fail within the array. The dual parity is spread across all the disks and takes the space of two drives.
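The trade-off between capacity and fault tolerance in the four levels above can be worked through with a little arithmetic. The sketch below is illustrative only: the function name `raid_summary` and the `LEVELS` table are made up for this example, RAID 1 is modelled as the two-disk mirror described above, and the minimum disk counts for RAID 5 and RAID 6 (three and four) are assumptions that the article itself does not state.

```python
# Per level: (minimum disks, disks' worth of space used for parity,
# drive failures tolerated). The minimums for RAID 5 and 6 are assumptions.
LEVELS = {
    0: (2, 0, 0),  # striping only, no redundancy
    5: (3, 1, 1),  # single parity takes the space of one disk
    6: (4, 2, 2),  # dual parity takes the space of two disks
}


def raid_summary(level, n_disks, disk_tb):
    """Return (usable capacity in TB, tolerated drive failures) for a basic level."""
    if level == 1:
        # Modelled here as a simple two-disk mirror: one disk's worth of space.
        if n_disks != 2:
            raise ValueError("this sketch models RAID 1 as exactly two disks")
        return disk_tb, 1
    if level not in LEVELS:
        raise ValueError(f"RAID {level} is not covered by this sketch")
    min_disks, parity_disks, failures = LEVELS[level]
    if n_disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    return (n_disks - parity_disks) * disk_tb, failures


for level, n_disks in [(0, 4), (1, 2), (5, 4), (6, 4)]:
    usable, failures = raid_summary(level, n_disks, disk_tb=4)
    print(f"RAID {level}: {n_disks} x 4 TB -> {usable} TB usable, "
          f"survives {failures} drive failure(s)")
```

With four 4 TB drives, this gives 16 TB of usable space for RAID 0, 12 TB for RAID 5 and 8 TB for RAID 6, while the two-disk RAID 1 mirror offers 4 TB.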

Be prepared

As with all technology, nothing is foolproof. Unfortunately, drives can, and will, fail at some point in their lifetime. It is important to remember that redundancy in RAID isn’t the same as having a backup.

In an ideal world, RAID complements a backup system that copies critical data to a separate system. Preferably you would have multiple backups, with at least one of them offsite.

So what's next?

Well, the beauty of RAID is that its legacy will continue. On average, 2.5 quintillion bytes of data are created every day. This number is increasing, and it all needs to be stored.

The types of RAID configurations will of course evolve, but the basic principles of redundancy, mirroring and parity will remain just as essential for as long as we continue to store data in the same way. There is no reason why RAID, when it celebrates its 60th birthday, should not still be the go-to solution for data storage systems around the world.

By Robin England, Senior Research & Development Engineer at Kroll Ontrack. With over 20 years of experience in the data recovery industry, Robin develops the proprietary hardware and software tools that Kroll Ontrack engineers use to retrieve lost data from any type of storage media.
