De-duplication in itself is easy to understand – optimised storage capacity usage by eliminating duplicated data. However the devil is in understanding the different technologies, techniques and implementations in the market and relating these to customers specific needs.
Instead of storing data multiple times, de-duplication enables the data to be stored once and uses that single instance as a reference. The techniques used to do this vary. For instance, we could look for complete files which are the same, and only when these are a complete match with each other, is a single instance created.
Alternatively we could look at files which are basically similar (for example revisions of a draft document) and create a single instance of a master file only saving the byte level differences between this and subsequent files. So which of these approaches is best? As always, the answer is not straightforward.
If we look at the first of these – working at a file level, rather than a byte level, there are well established techniques such as CAS – Content Addressable Storage. With this approach the contents of the file are put through a mathematical mincer and the end product is a unique identifier which is attached to the file.
If exactly the same file exists somewhere else in the system, the mathematical mincer will produce exactly the same identifier – indicating a duplicate file which can be made into a single instance.
Using this approach, every time a spelling mistake is corrected, or punctuation is added to a document, a new identifier would be created and both versions of the document stored.
Continued on next page Tags: Business Continuity, Data Management, Information Life Cycle, Information/Data handling
Hot Topics

Office web is the latest addition to Microsoft's Office business suite and is set to be the company's most revolutionary version.

Microsoft's 14th version of its award winning, multi-billion dollar cash cow business suite, is the company's most ambitious to date.

Spotify is certainly one of the most popular online music websites in the world which is a feat for a service that was officially launched only in February 2009
Featured Content
- The New Voice of the CIO. 158 CIOs in midsized businesses across 31 countries reveal their insights and vision for enhancing
competitiveness over the next five years.
Download Document
Customer Case Studies
- How a wine wholesaler improved the flow of information
Download full case study
- The server that made an entire university smarter
Download full case study
Videos
Latest Tweets

Comments