    Where to begin with Deduplication


    16 April, 2008, by David Galton-Fenzi

    Additionally, the lifecycle of primary data can be fleeting (minutes or even seconds), so putting it through deduplication may be unnecessary. As a result, with a few evolving exceptions, byte-level deduplication is today aimed at the backup environment.

    Another key question is where in the data centre to implement deduplication. This may not sound too important, but it is a raging argument among the vendors in this part of the industry.

    Some approaches implement deduplication for backup with a software ‘agent’ loaded onto each application server that undertakes backup. This spreads the deduplication workload across the processing power of all the servers involved. Crucially, though, the agent must interact correctly and effectively with the existing backup software packages loaded onto those servers.

    The upside of this deduplication implementation at source is that the process is completed before any data is sent to the storage devices, minimising the data transfers between server and storage.
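To make the source-side idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not any vendor's implementation: the names `BackupTarget`, `agent_backup` and `restore` are hypothetical, and fixed-size chunks hashed with SHA-256 stand in for whatever byte-level matching a real product uses. The agent hashes each chunk on the server and only sends chunks the storage side has not seen before.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class BackupTarget:
    """Hypothetical storage device that keeps one copy of each chunk."""

    def __init__(self):
        self.chunks = {}         # digest -> chunk bytes
        self.bytes_received = 0  # chunk payload that actually crossed the wire

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, chunk):
        self.chunks[digest] = chunk
        self.bytes_received += len(chunk)


def agent_backup(data, target):
    """Server-side agent: hash chunks locally and send only unseen ones.

    Returns the ordered list of digests needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not target.has(digest):
            target.put(digest, chunk)  # only new chunks are transferred
        recipe.append(digest)
    return recipe


def restore(recipe, target):
    """Rebuild the original data from its list of digests."""
    return b"".join(target.chunks[d] for d in recipe)
```

In this sketch the hashing cost lands on the application server, which is the trade-off described above: less data sent to storage, paid for with server CPU and an extra piece of software to maintain.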

    The downside is the one encountered by any agent-based strategy: the agent must stay compatible with the server software. Any software upgrade or change on any server therefore creates a potential for incompatibility and adds to the management task for the server administrators.

    The alternative approach is to have a dedicated platform in the backup path that handles deduplication ‘on the fly’. This effectively centralises the process.

    The benefits here are that the platform, not the servers, delivers the processing power for the deduplication and that, because it requires no changes to the server software, it is effectively transparent to the user. Some storage vendors are taking up the idea of embedding these functions in their storage devices, though no such products appear to exist yet.
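The centralised alternative can be sketched in the same spirit. Again this is a hedged illustration: `InlineDedupPlatform` and its methods are hypothetical names, and fixed-size chunking with SHA-256 is an assumed simplification. The servers send their backup streams unchanged; the platform does all the hashing and, because it sees every stream, can also remove duplicates shared between different servers.

```python
import hashlib


class InlineDedupPlatform:
    """Hypothetical appliance sitting between backup servers and storage."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.store = {}          # digest -> chunk written to back-end storage
        self.catalog = {}        # (server, backup name) -> list of digests
        self.bytes_ingested = 0  # everything the servers sent, unreduced

    def ingest(self, server, name, stream):
        """Accept an unmodified backup stream and deduplicate it on the fly."""
        self.bytes_ingested += len(stream)
        recipe = []
        for i in range(0, len(stream), self.chunk_size):
            chunk = stream[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.store.setdefault(digest, chunk)  # keep the first copy only
            recipe.append(digest)
        self.catalog[(server, name)] = recipe

    def bytes_stored(self):
        """Total size of the unique chunks held on back-end storage."""
        return sum(len(chunk) for chunk in self.store.values())

    def read(self, server, name):
        """Reassemble a backup from its catalogued chunks."""
        return b"".join(self.store[d] for d in self.catalog[(server, name)])
```

Note that nothing in this sketch runs on the application servers, which is why the approach needs no agent. The cost is that the full, unreduced stream still travels from each server to the platform.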

    In many ways this endorses the in-line platform as the most elegant solution, because all these vendors are doing is keeping the dedicated in-line platform but relocating it into the storage device.

    Whichever approach eventually becomes the dominant implementation, as the data deluge continues to accelerate, deduplication will rapidly become a core element of any data centre’s storage strategy. 

    It is not only the storage capacity savings that are attractive. The support deduplication can offer for compliance (a single instance of a file is easier to manage, protect and delete as required) will also continue to drive this market.
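The compliance point can be illustrated with a small file-level sketch. `SingleInstanceStore` and its methods are hypothetical names used only for this example, and whole-file hashing with SHA-256 is an assumption. Because identical files collapse to one stored instance, finding every place a document is referenced, or deleting it outright, becomes a single lookup.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical file-level store: identical files share one stored copy."""

    def __init__(self):
        self.blobs = {}  # digest -> file content (one instance per content)
        self.names = {}  # logical file name -> digest

    def add(self, name, content):
        """Record a file; the content is stored only if it is new."""
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)
        self.names[name] = digest
        return digest

    def locations(self, digest):
        """Every logical name that refers to the given content."""
        return sorted(n for n, d in self.names.items() if d == digest)

    def purge(self, digest):
        """Delete the single stored instance and all names pointing at it."""
        removed = self.locations(digest)
        for name in removed:
            del self.names[name]
        self.blobs.pop(digest, None)
        return removed
```

For example, a document saved once in a shared folder and again as a mail attachment would occupy one entry in `blobs`, and one `purge` call would remove it from both places.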

    Tags: Business Continuity, Data Management, Information Life Cycle, Information/Data handling