The Internet, and indeed every business, runs on data – usually lots of it. Thanks to the proliferation of cloud-based services we discussed in the first feature in this series, that data could be all over the place, too. Keeping track of all this data, and ensuring that it is secure, is a major logistical nightmare. As we explained in an earlier feature focusing on Big Data, interlinked information and its analysis are also the next big revolution currently brewing in Web culture. In this feature we examine the issues of data growth, and look at some of the solutions available.
Reaching for the cloud store
In reality, cloud storage shouldn't be about having some documents on a service like Google Drive and others stored locally, which is clearly a recipe for confusion. It should be about having your documents appear to be in the same place all the time – on whatever machine you are currently using – but also having them seamlessly and securely stored in a remote location as well. You could, of course, standardise on just one cloud service and specify that everyone in your organisation use this for all primary document storage. But this locks you into a single vendor, which may not offer all the features you need, in particular the level of service to suit a mission-critical application. You have also moved from multiple points of failure – that is, each individual user's machine – to just one. Placing all your business storage eggs in one cloud-based basket, even if the chosen service is very reliable indeed, presents some major risks.
To compensate for this, the service provided by, for example, Nasuni acts as a layer between your company and third-party cloud services, so you're not locked into a single one. A recent test by Nasuni put Microsoft Windows Azure ahead of the competition, beating Amazon to the top slot that the latter had occupied in the previous year's test. So the best vendors clearly do change regularly, and Nasuni performs these tests so that it can always use the best cloud vendors for its service. It essentially acts as a storage hypervisor for cloud capacity, so the client sees the same service no matter which cloud service is actually being used.
Services like Nasuni's make cloud-based storage act like regular network storage from the perspective of your users.
A storage hypervisor is a virtual layer that allows storage from multiple disparate sources to be pooled together and then apportioned out to users as required. For example, SANsymphony-V from DataCore Software can pool NAS and SAN capacity into a single resource. It also offers a Cloud Gateway that will present cloud storage services as local iSCSI-attached capacity. It supports a range of private and public cloud solutions, including Amazon S3, AT&T Synaptic, Nirvanix, EMC Atmos, Windstream, Mezeo, Scality and others, as well as regular NAS. It also facilitates migration between vendors, so you don't get locked into any particular one and can switch if a different vendor offers a better deal. Virsto's software, on the other hand, is primarily geared towards making your existing storage more efficient using a hypervisor system, while TwinStrata's CloudArray provides an iSCSI or NAS interface to cloud services from over 20 providers.
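The pooling idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the Backend and StoragePool classes, the backend names and the "fill the emptiest backend first" rule are hypothetical, and do not represent how any of the products mentioned above actually work.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Backend:
    """One source of capacity (hypothetical model, not a vendor API)."""
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


@dataclass
class StoragePool:
    """Presents several backends as a single pool of capacity."""
    backends: List[Backend] = field(default_factory=list)

    @property
    def free_gb(self) -> int:
        return sum(b.free_gb for b in self.backends)

    def allocate(self, size_gb: int) -> Dict[str, int]:
        """Carve a volume out of the pool, spilling across backends if needed."""
        if size_gb > self.free_gb:
            raise ValueError("pool has insufficient free capacity")
        placement: Dict[str, int] = {}
        remaining = size_gb
        # Assumed policy: fill the emptiest backend first. A real product
        # would weigh cost, latency and policy as well.
        for b in sorted(self.backends, key=lambda b: b.free_gb, reverse=True):
            if remaining == 0:
                break
            take = min(b.free_gb, remaining)
            if take:
                b.used_gb += take
                placement[b.name] = take
                remaining -= take
        return placement


pool = StoragePool([
    Backend("local-san", 500),
    Backend("nas", 300),
    Backend("cloud-bucket", 1000),
])
volume = pool.allocate(1200)  # the caller never chooses a backend
print(volume, "free:", pool.free_gb)
```

The point of the sketch is that the caller asks the pool for capacity and never needs to know which backend supplies it, which is what makes switching vendors behind the layer possible.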
The beauty of any cloud system is in scalability. If your company manages its own physical storage in a traditional way within its own data centre, adding space won't be immediate. Putting aside whatever timeframe is required by procurement at your company to obtain new hardware, there's also the process of rolling this out, which may involve service downtime or the need to transfer resources from old to new storage capacity. With a cloud-based service, you only need to place the order and extra storage can be arranged almost immediately. The Nasuni system also gets around the issue of potentially unreliable Internet connections by providing a local storage controller that can be accessed via standard protocols like iSCSI, CIFS and NFS. The controller, which can be actual hardware or a virtual machine, acts as a local cache of recently accessed data, but also ensures that the connection to the cloud storage is fully encrypted.
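To show what "a local cache of recently accessed data" means in practice, here is a small Python sketch of a read-through cache that evicts the least recently used item. The CachingController class, its parameters and the dictionary standing in for the cloud store are all hypothetical; this is not Nasuni's implementation, and encryption is left out entirely.

```python
from collections import OrderedDict
from typing import Callable


class CachingController:
    """Hypothetical local controller: LRU cache in front of a remote store."""

    def __init__(self, fetch_remote: Callable[[str], bytes], max_items: int = 2):
        self._fetch_remote = fetch_remote
        self._max_items = max_items
        self._cache: "OrderedDict[str, bytes]" = OrderedDict()
        self.remote_reads = 0  # counts trips to the cloud

    def read(self, path: str) -> bytes:
        if path in self._cache:
            self._cache.move_to_end(path)    # mark as most recently used
            return self._cache[path]
        data = self._fetch_remote(path)      # cache miss: go to the cloud
        self.remote_reads += 1
        self._cache[path] = data
        if len(self._cache) > self._max_items:
            self._cache.popitem(last=False)  # evict least recently used
        return data


# Stand-in for the cloud store; a real controller would also encrypt traffic.
cloud = {"a.doc": b"alpha", "b.doc": b"beta", "c.doc": b"gamma"}
controller = CachingController(cloud.__getitem__)

controller.read("a.doc")
controller.read("a.doc")   # served from the local cache
controller.read("b.doc")
controller.read("c.doc")   # cache is full, so a.doc is evicted
controller.read("a.doc")   # has to go back to the cloud
print("remote reads:", controller.remote_reads)
```

Repeated reads of the same document are served locally, so a slow or briefly unavailable connection only matters for data that has not been touched recently.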
Watch out for the SLA
One of the key advantages of using multiple cloud vendors is that Nasuni can offer a surprisingly generous service level agreement (SLA), pledging 100 per cent availability, accessibility and security, although this isn't unique. Rackspace offers a similar 100 per cent SLA for its cloud storage, and Memset's Memstore offers 99.995 per cent uptime. But Google's Cloud Storage and Microsoft's Windows Azure have 99.9 per cent SLAs. This may seem like a trivial difference in potential downtime. However, 99.9 per cent still allows about 43 minutes of downtime in a 30-day month, which could come all at once at a critical moment. It's also worth noting that although subscription fee credits will be applied when service drops below this level, for Google's services the maximum is only 50 per cent of your bill for anything below 95 per cent uptime, whereas Rackspace and Dimension Data will credit up to 100 per cent of your bill, and Nasuni offers 10 days of free service for every day of downtime, up to a maximum of three months of credit in a 12-month period.
Always check the small print of your cloud storage SLA as the terms and conditions may be more onerous than you think.
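The gap between SLA percentages is easier to judge when converted into minutes. The short Python sketch below does that conversion for the uptime figures quoted in this feature; the function name and the assumption of a 30-day month are ours, not taken from any vendor's SLA wording.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in a period before the SLA is breached."""
    return (1 - sla_percent / 100) * days * MINUTES_PER_DAY


# Uptime levels quoted in this feature, assuming a 30-day month.
for sla in (99.9, 99.95, 99.995):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows {minutes:.1f} minutes of downtime")
```

On that basis 99.9 per cent works out at roughly 43 minutes a month, 99.95 per cent at roughly 22 minutes, and 99.995 per cent at a little over two minutes.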
Another consideration is whether the SLA stipulates any criteria before it kicks in at all. For example, HP's Compute Cloud and Amazon Web Services (AWS) offer 99.95 per cent uptime in their SLAs. This may amount to barely 22 minutes of downtime a month before credit rebates kick in, but both companies also have strict requirements for their SLAs to be in force at all. Amazon requires its customers to have applications run across two "availability zones" (AZs), which are essentially physically separate data centres, and HP requires Compute Cloud customers to span their services potentially across three AZs. Only when all AZs are down does the SLA activate. This forces customers to replicate their services across multiple AZs, which is more complicated and costly. This is definitely something to consider. In October, AWS's Elastic Block Store (EBS) experienced degraded service, taking down some well-known websites including Pinterest, GitHub, Gamespot, Reddit, Imgur, Payvment and Airbnb. EBS is a cloud storage service that generally works in tandem with Amazon's EC2 compute service.
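The effect of the "all AZs must be down" condition can be shown with a small Python sketch. It is a simplified model under our own assumptions: the function name, the per-minute up/down lists and the zone names are hypothetical, and real SLAs define outages in more detail than this.

```python
from typing import Dict, List


def sla_downtime_minutes(az_status: Dict[str, List[bool]]) -> int:
    """Count minutes in which every availability zone was down.

    az_status maps a zone name to a per-minute list, True meaning 'up'.
    """
    timelines = list(az_status.values())
    # A minute only counts towards the SLA if no zone at all was up.
    return sum(1 for minute in zip(*timelines) if not any(minute))


# Hypothetical ten-minute window across two zones.
status = {
    "zone-a": [True, False, False, False, True, True, True, True, True, True],
    "zone-b": [True, True, False, False, False, True, True, True, True, True],
}
counted = sla_downtime_minutes(status)         # only the overlap counts
zone_a_outage = status["zone-a"].count(False)  # what a single-zone customer saw
print("counted by SLA:", counted, "| seen in zone-a alone:", zone_a_outage)
```

In this example a customer running only in zone-a would have been down for longer than the SLA recognises, because the SLA clock runs only while both zones are out together.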
Private clouds on parade
So it would seem that third-party cloud storage hosting is not the complete Holy Grail some evangelists would want us to believe, which might push things back towards maintaining your own private cloud-based system. For example, IBM SmartCloud Virtual Storage Center links to an IBM SAN Volume Controller to provide virtualised capacity. It acts as a sophisticated storage hypervisor, and even allows volumes to be connected between physical sites. This increases fault tolerance, as a crisis at one site still leaves the connected sites unaffected. Microsoft's System Center 2012 pledges to ease the transition to a private cloud for compute and networking resources as well as storage. Resources can be delivered as services, so they can be requested, configured and managed through an interactive portal, allowing almost instant automatic expansion and contraction of provision as required. This is a much more efficient method for delivering services, as you don't end up with one department being under-provisioned whilst another is over-provisioned. Resources can be balanced between them as required.
IBM's Tivoli Storage Manager for Virtual Environments backs up your virtual machines, taking this performance-degrading task away from the virtual machines themselves.
Even if you keep all computing and storage within your organisation, there could still be issues. In particular, the growth in use of virtualisation means an enterprise server could be running numerous virtual machines, all of which require backing up and some of which may need restoring occasionally too. This is where IBM Tivoli Storage Manager for Virtual Environments comes in. Its vStorage backup server offloads these tasks so they don't need to be run within the virtual machines themselves. The backups are then sent to an IBM Tivoli Storage Manager server, and on to the main backup system. You can read more about how this works in this IBM white paper. The hypervisor, or virtual machine monitor, is a major factor in the ability to offer cloud services. By breaking the direct connection between the operating system and underlying hardware, it allows multiple heterogeneous systems to operate side by side and be loaded as required.
A final consideration is the explosion in mobile devices. Pretty much every new phone is a powerful smartphone that can potentially create and edit everyday office documents, or even multimedia. Then there are the tablets that are starting to make the traditional notebook redundant for the business traveller looking to keep their hand luggage light. These devices can work well with data resident on cloud-based systems, but they will also potentially have some documents stored locally. Throw in new concepts such as Dell Wyse's “Project Ophelia”, which provides cloud access to MHL-enabled displays via a device the size of a USB thumb drive, and you've got a truly confusing landscape of data and access devices. In the third and final part of this trilogy, we place cloud storage management in the wider context of business computing asset management, to see how it both fits into this and provides further solutions itself.