Additionally, the lifecycle of primary data can be fleeting (minutes or even seconds), so deduplicating it may be unnecessary. As a result, with a few evolving exceptions, byte-level deduplication is today aimed at the backup environment.
Another key question is where in the data centre to implement deduplication. This may not sound important, but it is the subject of a raging argument among the vendors in this part of the industry.
Some approaches implement deduplication for backup with a software ‘agent’ loaded onto each application processor that undertakes backup. This spreads the deduplication processing load across all the servers involved, but the agent must, crucially, interact correctly and effectively with the existing backup software packages loaded onto those servers.
The upside of implementing deduplication at source is that the process is completed before any data is sent to the storage devices, minimising data transfers between server and storage.
The downside is the one encountered by any agent-based strategy: the agent must stay compatible with the server software. Any software upgrade or change on any server therefore creates a potential for incompatibility and adds to the management task for the server administrators.
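To make the source-side approach concrete, here is a minimal Python sketch of what such an agent might do. It is an illustration under stated assumptions, not any vendor's implementation: the fixed 4 KB chunk size, the SHA-256 fingerprints, and the names `BackupTarget`, `agent_backup` and `restore` are all hypothetical. The agent fingerprints each chunk on the server and sends only chunks the storage target has not already seen.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


def split_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Yield fixed-size chunks of data (the last one may be shorter)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


class BackupTarget:
    """Hypothetical stand-in for the storage device behind the network."""

    def __init__(self):
        self.chunks = {}         # fingerprint -> chunk bytes
        self.bytes_received = 0  # payload bytes that crossed the "network"

    def has(self, digest: str) -> bool:
        return digest in self.chunks

    def put(self, digest: str, chunk: bytes) -> None:
        self.chunks[digest] = chunk
        self.bytes_received += len(chunk)


def agent_backup(data: bytes, target: BackupTarget) -> list:
    """Source-side pass: hash locally, send only chunks the target lacks.

    Returns the ordered list of fingerprints needed to rebuild the data.
    """
    recipe = []
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if not target.has(digest):
            target.put(digest, chunk)  # only new chunks are transferred
        recipe.append(digest)
    return recipe


def restore(recipe: list, target: BackupTarget) -> bytes:
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(target.chunks[digest] for digest in recipe)
```

In this sketch, backing up the same data a second time transfers no chunk payload at all, which is the reduction in server-to-storage traffic described above. The hashing cost is paid on the server running the agent.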
The alternative approach is to have a dedicated platform in the backup path which handles deduplication ‘on the fly’. This effectively centralises the process.
The benefits here are that the platform, not the servers, delivers the processing power for the deduplication and, because it requires no changes to the server software, it is effectively transparent to the user. Some storage vendors are taking up the idea of embedding these functions in their storage devices, though no such products appear to exist yet.
In many ways this endorses the in-line platform as the most elegant solution, because all these vendors are doing is keeping the dedicated in-line platform but locating it inside the storage device.
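The in-line alternative can be sketched in the same way. The following Python is a rough illustration only: the class name `InlineDedupPlatform`, its methods, the fixed 4 KB chunking and the SHA-256 fingerprints are assumptions made for the example, not a description of any real product. The servers send their backup data unmodified, and the platform removes duplicates before anything is written to storage.

```python
import hashlib


class InlineDedupPlatform:
    """Hypothetical appliance in the backup path; servers send it raw data."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size  # assumed fixed-size chunking
        self.store = {}               # fingerprint -> chunk, shared by all servers
        self.recipes = {}             # backup name -> ordered fingerprints
        self.bytes_in = 0             # raw bytes received from servers
        self.bytes_stored = 0         # unique bytes actually written

    def write(self, name: str, data: bytes) -> None:
        """Accept a full backup stream and store only chunks not seen before."""
        self.bytes_in += len(data)
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.store:
                self.store[digest] = chunk
                self.bytes_stored += len(chunk)
            recipe.append(digest)
        self.recipes[name] = recipe

    def read(self, name: str) -> bytes:
        """Reassemble a named backup from the shared chunk store."""
        return b"".join(self.store[digest] for digest in self.recipes[name])
```

Because every server writes through the same chunk index, data common to several servers is stored once, and the servers themselves run no extra software. The trade-off, compared with the agent approach, is that the full backup stream still travels from each server to the platform.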
Whichever approach eventually becomes the dominant implementation, as the data deluge continues to accelerate, deduplication will rapidly become a core element of any data centre’s storage strategy.
It is not only the storage capacity savings that are attractive, but also the support deduplication can offer for compliance (a single stored instance of a file is easier to manage, protect and delete as required) that will continue to drive this market.