Ahead of the AWS re:Invent conference in Las Vegas next week, we at ITProPortal thought we’d take a look at one of the most enduring mysteries in modern IT – a secret so closely guarded that it has come to represent the hunt for Bigfoot in the world of cloud storage.
That is, the secret behind Amazon’s Glacier.
Since its launch in 2006, Amazon Web Services (AWS) has offered businesses a number of different platforms that allow IT managers to outsource expensive and resource-intensive aspects of their infrastructure to the web giant. Of these, Amazon’s Simple Storage Service (S3) and Elastic Compute Cloud (EC2) have been the most central and popular.
But of all the services offered by AWS, none has fuelled the same level of speculation and interest as Amazon’s Glacier. Though the service is well known and widely used in the enterprise, no one knows exactly what’s behind it.
The low-cost service has been around for just over a year now. It allows users to ‘freeze’ enormous amounts of data in ‘vaults’ for a tenth of the price of regular storage. The only catch is that there’s a waiting time of three to five hours for retrieval of the data, and customers are charged if they exceed their ‘retrieval limit’, which is roughly five per cent of their total storage per month.
The introduction of Glacier has allowed organisations to make enormous storage savings. The cost of archiving data with the service can be as little as $0.01 (0.6p) per GB, per month. That means freezing a terabyte of data in Glacier for a month costs only £6.20, compared to the simple storage price of nearly £80.
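To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The per-gigabyte price and the five per cent retrieval figure come from the numbers quoted above; the exchange rate of 0.62 pounds to the dollar and the 1,000 GB terabyte are our own assumptions, chosen because they reproduce the £6.20 figure, and the function names are purely illustrative.

```python
# Figures quoted in the article
GLACIER_USD_PER_GB_MONTH = 0.01   # archive price per GB, per month
FREE_RETRIEVAL_FRACTION = 0.05    # "roughly five per cent" of stored data per month

# Assumptions made for this sketch only
USD_TO_GBP = 0.62                 # rough exchange rate implied by the £6.20 figure
GB_PER_TB = 1000                  # decimal terabyte


def monthly_storage_cost_gbp(gigabytes):
    """Monthly cost in pounds of keeping `gigabytes` frozen."""
    return gigabytes * GLACIER_USD_PER_GB_MONTH * USD_TO_GBP


def free_retrieval_allowance_gb(gigabytes):
    """Data that can be pulled back in a month before extra charges apply."""
    return gigabytes * FREE_RETRIEVAL_FRACTION


if __name__ == "__main__":
    print(f"1 TB frozen for a month: £{monthly_storage_cost_gbp(GB_PER_TB):.2f}")
    print(f"Retrieval allowance: {free_retrieval_allowance_gb(GB_PER_TB):.0f} GB")
```

On these assumptions, a customer with a terabyte in the vault could pull back around 50 GB a month before hitting the retrieval limit.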
That, as some users commented at the time, was a “game-changer”.
Since Glacier’s release in August of last year, Amazon has maintained a thick veil of secrecy around its most mysterious web service. The Seattle-based company has always kept the processes behind its services fairly quiet, but the omerta surrounding Glacier has been especially strict, leaving experts in the tech community perplexed about what Amazon could be hiding.
When we spoke to Rebecca Thompson, Vice President of Marketing at storage provider Avere Systems, she told ITProPortal that “Amazon won’t say. It’s tape, or some people believe they have shelves you pull out, some people believe that it’s something completely different. Some secret proprietary technology.”
Over the last year, many have asked what the particular process surrounding Glacier is, and why there’s such a waiting time for data retrieval. But the web giant, unsurprisingly, isn’t saying.
So how does Amazon do it? What’s the secret behind Glacier?
When Glacier was launched, users were astounded by the low cost of data storage. For public sector organisations like government, hospitals and emergency services, which are required by protocol to keep long-term archives, the product was an instant attraction – but the associated risks are also high.
At first, speculation was rife that Glacier was simply a magnetic tape service by another name. The waiting time, for instance, seemed a clear indication to some commentators that Amazon was just backing up users’ data on reams and reams of tape, and packing them into huge warehouses somewhere.
So is Amazon just using the oldest trick in the book?
Organisations have been using magnetic tape to store data since the early and mid-1970s. When IBM introduced the world to cartridge storage, however, magnetic tape reels began to be increasingly used only for archives and data that didn’t need to be accessed quickly.
The rolls of tape are typically wound around 26.67 cm reels and stored in airtight containers where possible. Despite the technology remaining largely unchanged since its inception, the highest-capacity modern tape cartridges can store as much as 8.5 TB of uncompressed data.
For decades, medium and large data centres have used tape and disk storage to complement each other, with tape the favoured choice for archival and long-term data that doesn’t need to be accessed frequently. However, the cost of disk storage has fallen faster than that of tape in the intervening years.
With the arrival of Amazon’s Glacier service, many in the storage industry were predicting that magnetic tape, after nearly 40 years, was finally dead.
The main drawback of magnetic tape is that it degrades over time.
Magnetic tape can fall victim to binder hydrolysis, also known by the name “sticky shed syndrome”. The dreaded sticky shed occurs when the glue holding the magnetic particles to the tape begins to break down. The condition is exacerbated by hot and humid conditions, and severe cases leave archivists with nothing but a tragic line of clear tape and a pile of once-magnetic dust.
The tape can also simply lose its magnetisation over time, thanks to one of physics’ more annoying laws: a process called “magnetic remanence decay”.
In fact, your typical reel of tape has a shelf life of only 10-20 years, and that’s with optimal preservation conditions. This isn’t at all ideal for long-term storage, when patients’ healthcare details or an organisation’s transactional history are at stake.
Because of all these drawbacks, it’s fairly safe to say that Amazon isn’t simply spooling out millions of miles of magnetic tape and locking it up in huge containers somewhere. The risk that would pose to customer data surely wouldn’t be worth the cost savings, and those savings probably wouldn’t even be that large anyway, given the increasingly cheap hard drives available on the market.
Amazon has also denied using tape to power Glacier.
“Essentially you can see this as a replacement for tape,” a company spokesman said last year.
The company has also promised 11 nines of annual durability (99.999999999 per cent), a result which no one can expect from magnetic tape.
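To put 11 nines in perspective, the sketch below turns the quoted durability figure into an expected number of lost archives per year. It assumes, and this is our reading rather than anything Amazon has spelled out here, that the figure can be treated as an average annual survival probability for each archive; the function names are illustrative.

```python
from decimal import Decimal

# Durability figure quoted in the article: eleven nines
DURABILITY = Decimal("0.99999999999")


def annual_loss_probability(durability=DURABILITY):
    """Chance that a single archive is lost in a year (assumed interpretation)."""
    return Decimal(1) - durability


def expected_annual_losses(archive_count, durability=DURABILITY):
    """Average number of archives lost per year out of `archive_count`."""
    return Decimal(archive_count) * annual_loss_probability(durability)


if __name__ == "__main__":
    # Ten billion archives is an arbitrary illustrative figure.
    print(expected_annual_losses(10_000_000_000))
```

Read that way, a customer storing ten billion archives would expect to lose about one of them every ten years.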
When we spoke to Ronald Bianchini Jr., the President and CEO of Avere, he told us that “It’s hard to say. Five hours is pretty fast, if it’s tape.”
So it looks like the tape theory has been pretty thoroughly unravelled.
Super-cheap hard drives
In a private email with ZDNet, Amazon reportedly credited Glacier’s mysterious operating process to “inexpensive commodity hardware components.”
Reporters argued that “this suggests the system will be based on very large storage arrays consisting of a multitude of high-capacity low-cost discs.”
This is certainly one of the most popular theories about the mystery of Glacier.
But if this is the case, how can Amazon keep the costs so low in comparison to S3 storage?
To create the infrastructure needed to store such large amounts of data, Amazon would have to build a gigantic storage area network. That means warehouses and data centres, on top of the purchasing of the hard drives. These hard drives have to be hooked up to servers, too, and those servers have to be kept powered and cooled.
All of this has a huge overhead, and it just doesn’t seem like a viable strategy for the kind of prices Amazon is offering.
What’s perhaps more important is that this theory doesn’t account for the waiting time for retrieval of data. Is the three-to-five-hour wait simply an arbitrary limit designed to disincentivise companies from using Glacier when they should be using the more lucrative S3?
If that is the case, it might explain the code of silence surrounding the service. Users, after all, might be a little annoyed if such an arbitrary limit were being set without any physical or hardware-based justification.
Other commentators have expressed concern that a simple hard drive-based solution would be vulnerable to the occasional massive outages that have crippled Amazon’s services in the past – the most recent of which took out Instagram and Vine in August of this year.
The more likely solution
A more likely answer to the mystery is a combination of these two options. While keeping hard drives constantly online and cooled would be expensive and resource-intensive, and warehouses full of magnetic tape would be vulnerable and wasteful, Amazon seems to have struck upon an ingenious hybrid.
Note: this is nothing but pure, unadulterated conjecture.
First, it’s likely that Amazon has a number of cheap hard drives hooked up to its servers at any one time, waiting to receive the data as users send it to Glacier to be frozen. When these hard drives are bulging full of sweet data, they are physically disconnected and put into storage.
Software engineer Dave Van den Eynde has argued that Amazon’s process is then “simply a matter of keeping enough hard drives attached and filling up as required to handle the upload demand.”
The disks are then stored, probably in duplicate or triplicate, in some kind of secure cabinet, likely airtight and cooled. When a user requests access to their data, the hard drive is then retrieved, probably placed in a queue, and reconnected to the system to be accessed. There could even be a process of checking old hard drives for errors and ensuring the integrity of data if they haven’t been accessed for a long time.
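For the curious, the conjectured cycle of filling, shelving and re-attaching drives can be sketched as a toy model. To be clear, this is only an illustration of the guesswork above, not a description of how Glacier actually works: the `ColdStore` class, its method names and the single-copy, first-come-first-served queue are all our own inventions.

```python
from collections import deque


class ColdStore:
    """Toy model of the conjectured fill, shelve and re-attach cycle."""

    def __init__(self, disk_capacity_gb):
        self.disk_capacity_gb = disk_capacity_gb
        self.active = {}      # archive id -> size, on the drive currently online
        self.shelved = []     # full drives that have been disconnected
        self.location = {}    # archive id -> index of the shelved drive holding it
        self.queue = deque()  # pending retrieval requests, first in, first out

    def upload(self, archive_id, size_gb):
        if size_gb > self.disk_capacity_gb:
            raise ValueError("archive is larger than a single drive")
        # If the online drive cannot take the archive, shelve it and start a new one.
        if sum(self.active.values()) + size_gb > self.disk_capacity_gb:
            self._shelve_active_drive()
        self.active[archive_id] = size_gb

    def _shelve_active_drive(self):
        for archive_id in self.active:
            self.location[archive_id] = len(self.shelved)
        self.shelved.append(self.active)
        self.active = {}

    def request(self, archive_id):
        """Join the back of the retrieval queue."""
        self.queue.append(archive_id)

    def serve_next(self):
        """Serve the oldest request; returns (drive index or None, size in GB)."""
        archive_id = self.queue.popleft()
        if archive_id in self.active:
            return None, self.active[archive_id]  # still on the online drive
        drive_index = self.location[archive_id]   # this drive must be re-attached
        return drive_index, self.shelved[drive_index][archive_id]


if __name__ == "__main__":
    store = ColdStore(disk_capacity_gb=100)
    store.upload("scans-2012", 60)
    store.upload("scans-2013", 60)   # does not fit, so the first drive is shelved
    store.request("scans-2012")
    print(store.serve_next())
```

In this picture, the hours-long wait would simply be the time a request spends in the queue plus the time it takes someone, or something, to fetch the drive and plug it back in.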
As to whether this whole process is done manually, or by an army of robots all crafted in the exact image of Amazon founder and CEO Jeff Bezos, we have only speculation to guide us.
Avere CEO Bianchini was banking on this hypothesis, anyway.
“Some people think it’s discs they just turn off,” he told us. “So imagine that when you archive it, a robot goes out, and hits the power button on the box.”
Rebecca Thompson added, “it’s like hitting snooze. It’s inactive for a period of time, and just powers down.”
However, don’t jump to any conclusions. Bianchini expressed no little doubt about whether Amazon could recoup all of those costs just by powering down the drives.
“You have to think about the economics of it,” he said.
The search goes on
So there you have it: one commentator’s humble hypothesis, the equivalent of an out-of-focus photograph through the mist.
Even now, the mystery is far from solved.
Apparently, though, we’re not alone in wondering. Bianchini told us that “even in Amazon, they often say, ‘you tell us’. Once, one of the guys in the room said, ‘to be honest, I don’t know’.”
Images: Flickr (Scott Ableman; Andy Lederer; Jisc; Kwong Yee Cheng; hugovk; Christopher.F Photography; y3rdua)