    Where to begin with Deduplication


    16 April, 2008, by David Galton-Fenzi

    The result is that where files are constantly changing, the saving in storage capacity that can be achieved with CAS is fairly minimal.  So why do we have it at all? 

    The answer is: for archives. When a file is archived it is normally for long-term storage and is likely to be referenced rather than changed.  After all, changing the archives is like re-writing history. 

    This aspect of CAS is also a way to ensure that archived records are not tampered with (as might be a temptation in a company facing significant legislative challenge): any change will produce a new identifier, so the record will be seen as a changed file.
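The behaviour described above can be illustrated with a minimal sketch. It assumes the identifier is a SHA-256 digest of the file contents; the article does not name a particular hash function, and the `ContentAddressedStore` class and its method names are hypothetical, not taken from any CAS product.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by a digest of the content (illustrative only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Assumption: SHA-256 stands in for whatever function a real CAS uses.
        identifier = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same identifier, so it is stored once.
        self._blobs.setdefault(identifier, data)
        return identifier

    def get(self, identifier: str) -> bytes:
        return self._blobs[identifier]

    def verify(self, identifier: str) -> bool:
        # Recompute the digest; a mismatch means the stored bytes were altered.
        return hashlib.sha256(self._blobs[identifier]).hexdigest() == identifier

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
original = b"Quarterly report: revenue was 1.2 million."
edited = b"Quarterly report: revenue was 1.3 million."

id_first = store.put(original)
id_again = store.put(original)   # exact duplicate: no extra storage used
id_edited = store.put(edited)    # one character changed: new identifier

print(id_first == id_again)      # the duplicate shares an identifier
print(id_first == id_edited)     # the edited copy does not
print(len(store))                # number of distinct objects held
```

A one-character edit yields a completely different identifier and a second full copy, which is why the approach suits archives that rarely change and saves little on files that change constantly.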

    This is where the second technique, byte level deduplication, comes in.  At this level the mathematical mincer changes, and this time it is looking for differences between files at a byte level. 

    Going back to our previous example of a document where a spelling or punctuation change has been made, the byte level deduplication would recognise and store only the minor changes that have been made to the original document. 
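As a rough illustration of storing only the changes, the sketch below builds a byte-level delta between two versions of a document. It uses Python's standard `difflib.SequenceMatcher` purely as a stand-in; the article does not say which algorithm deduplication products use, and the `make_delta`, `apply_delta` and `literal_bytes` helpers are hypothetical names introduced for this example.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus the literal bytes that differ."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: record only where it sits in the original.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Changed or new run: these bytes must actually be stored.
            delta.append(("insert", new[j1:j2]))
        # "delete" needs no entry: the bytes are simply not copied.
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new version from the original plus the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(delta: list) -> int:
    """Count the bytes the delta has to store itself."""
    return sum(len(op[1]) for op in delta if op[0] == "insert")


old_doc = b"The quick brown fox jumps over the lazy dog. " * 20 + b"Their going home."
new_doc = old_doc.replace(b"Their", b"They're")   # a small spelling correction

delta = make_delta(old_doc, new_doc)
print(apply_delta(old_doc, delta) == new_doc)     # the new version is recoverable
print(len(new_doc), literal_bytes(delta))         # full size vs. bytes stored in the delta
```

Comparing every byte of two versions is the kind of extra processing the technique demands: the delta is small, but computing it costs far more than hashing a whole file once.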


    This is an effective way to minimise the storage capacity consumed, but it does not provide the change tracking that CAS delivers.  Where ‘live’ data is being used, this approach is far and away the most effective for an enterprise environment; the challenge is that it consumes much more processing power.

    On the face of it, this would go a long way towards saving expensive primary storage capacity.  However, the reality is that in most primary storage environments the emphasis is on performance rather than on saving disk capacity, and any performance overhead (such as the mathematics to determine duplication) is seen as an inhibitor to speed of delivery.

    Tags: Business Continuity, Data Management, Information Life Cycle, Information/Data handling