Will Everyone Move Their Storage Into the Cloud?

After the success of cloud storage services in the consumer space, why aren’t enterprises adopting them on the same scale? What can technology vendors and service providers do to accelerate adoption?

First, we need to understand why enterprises would even contemplate cloud storage. Quite simply, operating on-site storage is expensive. Not only do you have the capital expense of purchasing the storage array(s) in the first place, you typically also have a raft of recurring operating expenses, including maintenance renewal fees, software licensing fees and data center costs.

Remember to double, perhaps triple, it all for a resilient multi-site solution! In addition, you have the pain of right-sizing the storage infrastructure: over-provision and you incur unnecessary capital and operational expenditure; under-provision and you are likely to face capacity, performance and budgetary issues.

On paper, cloud storage is a very attractive proposition because ultimately there is no infrastructure to manage and, used in the right way, with the right data and traffic profiles, it can be relatively inexpensive. As with the majority of cloud services, you only pay for what you use and you get as much capacity as you need, on demand. As part of the default pricing structure, most cloud storage services also create geographically redundant copies of your data automatically. Utopia, right? Well, the reality is that while there is indeed huge potential for cost reduction using cloud storage, it hasn't been heavily adopted by enterprises, for a number of reasons:

  • - Security and Compliance: Data security is critical for almost every large enterprise. Although cloud storage providers are devoting a significant percentage of their time and budgets to securing customer information and providing location-specific options, there is still a perceived risk that data can be compromised. For regulated industries or companies working with sensitive information, or even for those who are simply risk-averse, this makes cloud storage a non-starter.
  • - Performance: Unless you move your applications close to the cloud storage, which can be hard to guarantee even when using the same service provider, data is accessed over the Internet, with far less bandwidth and much higher latency than a traditional storage network. On top of this, storage-specific protocols are replaced by Internet protocols such as HTTP over TCP/IP, so applications written with the expectation of local storage performance simply don’t work well across the Internet. Even if you can get the required network bandwidth, latency and protocols, cloud storage providers typically don’t offer performance SLAs at the I/O level.
  • - Data Protection and Availability: As mentioned previously, cloud storage providers automatically keep multiple copies of data in geographically separate locations. Whilst this is great protection against a full site or physical equipment failure, it doesn’t protect against user or application errors such as accidental deletions or data corruption. Any user action or corruption is repeated across all the physical copies of the data. With on-site storage, enterprise arrays typically have snapshots to guard against such issues.
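
The difference between replication and snapshots can be shown with a small sketch. Everything here is hypothetical: `ReplicatedStore` is a toy in-memory model written for this illustration, not any provider's or array vendor's API. It assumes every write and delete is mirrored to all copies, and that a snapshot is a frozen point-in-time copy.

```python
import copy


class ReplicatedStore:
    """Toy model (hypothetical): each operation is mirrored to every replica."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.snapshots = []

    def put(self, key, value):
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key):
        # The delete is faithfully repeated on every copy.
        for replica in self.replicas:
            replica.pop(key, None)

    def snapshot(self):
        # Freeze a point-in-time copy that later operations cannot touch.
        self.snapshots.append(copy.deepcopy(self.replicas[0]))
        return len(self.snapshots) - 1

    def restore(self, snapshot_id, key):
        # Copy one item back from a snapshot into the live replicas.
        self.put(key, self.snapshots[snapshot_id][key])


store = ReplicatedStore()
store.put("report.doc", b"quarterly figures")
snap = store.snapshot()

store.delete("report.doc")  # accidental deletion
survivors = [r for r in store.replicas if "report.doc" in r]

store.restore(snap, "report.doc")
recovered = [r for r in store.replicas if "report.doc" in r]
```

After the accidental delete, `survivors` is empty: three replicas gave no protection at all. Only the snapshot allows the item to be recovered.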

When your data is stored in the cloud and your provider has an outage, what do you do? Enterprises have real concerns regarding service lock-in and operational integrity. Enterprise IT managers have to mitigate these risks and make sure that they maintain data availability; for many, cloud storage is seen as too big a risk.

  • - Application Compatibility: Enterprises can run hundreds, if not thousands, of different applications on their servers. These applications have typically been written to talk to storage using industry-standard block and file protocols such as FC or NFS. Cloud storage services communicate via vendor-specific APIs, typically REST-style interfaces over HTTP. Therefore, most applications have to be rewritten to use the APIs provided by the cloud storage providers, and to work within the limitations of those APIs. This simply isn't practical, as mainstream applications running in the enterprise (e-mail, file shares, archiving, etc.) cannot be easily retooled for the cloud.
  • - Data Tiering: Organisations are notoriously bad at aligning different data types to the right storage technologies and solutions. It’s not uncommon to see large amounts of test/dev, user and archive data sitting on expensive, high-end FC arrays. Simply put, many IT organisations are not ready to align lower-tier data to cloud storage.
  • - Confidence: The first thing cloud providers need to do is provide a level of transparency to instill confidence in the enterprise market. For example, by exposing their technologies, processes and commercials, the service providers show they have nothing to hide. This is not consumer cloud, and enterprises will want to see more than just SLAs; they want to know they are backed by enterprise-class infrastructure and processes.
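
To make the application compatibility point above more concrete, here is a minimal sketch of the mismatch. It assumes a simplified, hypothetical `ObjectStore` class that only supports reading and writing whole objects, which is how many object storage APIs behave; it is not any real provider's interface. An application that updates a few bytes in place on a file has to be restructured as read, modify and re-upload.

```python
import os
import tempfile


class ObjectStore:
    """Hypothetical whole-object API: no seek, no partial writes."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, data):
        self._objects[key] = bytes(data)

    def get_object(self, key):
        return self._objects[key]


def patch_file(path, offset, data):
    # What a file-based application does: seek and overwrite in place.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def patch_object(store, key, offset, data):
    # The object-API equivalent: fetch everything, edit, upload everything.
    original = store.get_object(key)
    updated = original[:offset] + data + original[offset + len(data):]
    store.put_object(key, updated)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mailbox.dat")
    with open(path, "wb") as f:
        f.write(b"hello world")
    patch_file(path, 6, b"cloud")
    with open(path, "rb") as f:
        file_result = f.read()

store = ObjectStore()
store.put_object("mailbox.dat", b"hello world")
patch_object(store, "mailbox.dat", 6, b"cloud")
object_result = store.get_object("mailbox.dat")
```

Both paths end with the same bytes, but the object version moves the entire object across the network twice for a five-byte change. On a large mailbox or database file over a WAN link, that cost is why applications cannot simply be pointed at cloud storage unchanged.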

Ultimately, most of the issues highlighted above will be addressed in the next few years. This, coupled with organisations refreshing legacy apps or developing new apps in the cloud, will undoubtedly lead to increased adoption of cloud storage.

Perhaps one of the most interesting cloud storage concepts to hit the market in recent years is the development of on-site appliances that connect back to the cloud storage services. These appliances sit on the enterprise SAN, encrypt the data and translate between the SAN and the provider’s APIs. The cloud storage therefore appears on the SAN as if it were on-site storage, removing the need for application redevelopment. Through pre-defined policies, only the correct types of data are pushed out to the cloud. What about the bandwidth and latency issues, I hear you cry? Typically these appliances also have huge amounts of cache, de-duplication technologies and excellent data compression, so performance can be comparable to local arrays. Finally, by locating such an appliance at multiple sites, in the event of a data centre or WAN failure the remote cloud storage can simply be accessed from another site, removing the need for complex and expensive replication solutions.
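
As a rough illustration of how the cache, de-duplication and compression described above fit together, here is a minimal sketch. It is not how any particular appliance is implemented: `CloudGateway` and all its names are hypothetical, the "cloud" is just an in-memory dictionary, and encryption and tiering policies are left out to keep it short.

```python
import hashlib
import zlib
from collections import OrderedDict


class CloudGateway:
    """Toy gateway (hypothetical): block interface in front, object store behind."""

    def __init__(self, cache_blocks=2):
        self.cloud = {}             # digest -> compressed bytes (stand-in for the provider)
        self.volume = {}            # block number -> digest of its contents
        self.cache = OrderedDict()  # digest -> raw bytes, least recently used first
        self.cache_blocks = cache_blocks
        self.uploads = 0
        self.cloud_reads = 0

    def _remember(self, digest, data):
        self.cache[digest] = data
        self.cache.move_to_end(digest)
        while len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block

    def write_block(self, block_no, data):
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.cloud:
            # De-duplication: only previously unseen content is uploaded,
            # and it is compressed before it crosses the WAN.
            self.cloud[digest] = zlib.compress(data)
            self.uploads += 1
        self.volume[block_no] = digest
        self._remember(digest, data)

    def read_block(self, block_no):
        digest = self.volume[block_no]
        if digest in self.cache:
            self.cache.move_to_end(digest)
            return self.cache[digest]  # served locally, no WAN round trip
        self.cloud_reads += 1
        data = zlib.decompress(self.cloud[digest])
        self._remember(digest, data)
        return data


gw = CloudGateway(cache_blocks=2)
gw.write_block(0, b"A" * 4096)
gw.write_block(1, b"A" * 4096)  # identical content, not uploaded again
gw.write_block(2, b"B" * 4096)
data = gw.read_block(1)         # still in cache, so no cloud read
```

Three blocks are written but only two uploads happen, and the read is served from the local cache. Real appliances work at far larger scale, but the principle is the same: most I/O is kept on site and only unique, compressed data is sent to the provider.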

To give an example of how significant these technologies could be, NetApp acquired Bycast earlier this year, and its technology became NetApp's StorageGRID solution. Another one to watch is Cirtas who, although only a start-up, not only boast an impressive set of features but also have a $10m investment from Amazon, probably the largest cloud storage provider with their S3 service.