
Dealing with data migration: how to choose the right fit for your organisation

(Image credit: Shutterstock/alexskopje)

The migration of corporate IT infrastructure to the cloud is accelerating day by day. A survey by 451 Research projects that by 2020 about 60 percent of all enterprise workloads will be in the cloud. But migrating applications and data from a corporate data centre to a cloud platform presents a significant challenge. 

The migration may be mandatory, for example because of a merger or acquisition in which data from another organisation has to be moved to a new or existing environment, or because a business segment has been sold and its storage must be migrated elsewhere. In many cases, though, migration is voluntary, undertaken in a quest for better service delivery. Cloud services can cut costs, increase flexibility and, in some cases, deliver better service performance.

With the amount of data being generated rising at exponential rates, the question of how to migrate data from an organisation’s data centre to its new home in the cloud has become particularly pressing. Some published reports estimate that it would take about 120 days to migrate 100TB of data over a dedicated 100Mbps connection. Unless a data migration is well planned and well executed, it could cause significant disruptions to an organisation’s workflows. As Rebecca Hennessy, marketing head at Experian Data Quality, puts it, “Without a comprehensive data migration approach, any planned improvements for innovation, performance and growth, can be severely delayed, or worse, derailed completely.”
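The 120-day figure is roughly what simple arithmetic gives. The sketch below is a back-of-the-envelope estimate, not the method used by the reports cited: it assumes decimal terabytes, and the 80 percent sustained-throughput factor is an illustrative assumption to account for protocol overhead and contention. The function name `transfer_days` is hypothetical.

```python
def transfer_days(data_terabytes: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate the days needed to push a data set over a network link.

    efficiency is the fraction of the nominal line rate actually sustained
    (an assumed value, not a measured one).
    """
    bits_to_move = data_terabytes * 1e12 * 8      # decimal TB -> bytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    seconds = bits_to_move / usable_bits_per_second
    return seconds / 86_400                       # seconds in a day


ideal = transfer_days(100, 100)                       # perfect line rate
realistic = transfer_days(100, 100, efficiency=0.8)   # assumed 80% sustained

print(f"{ideal:.1f} days at full line rate")      # 92.6 days at full line rate
print(f"{realistic:.1f} days at 80% efficiency")  # 115.7 days at 80% efficiency
```

Even at a perfect line rate the transfer takes about three months, which is why the choice of migration strategy matters more than raw bandwidth for data sets of this size.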

This is especially the case when attempting to migrate a production environment to the cloud. In such cases, extreme care must be taken to ensure that operations are not disrupted by the migration process. And that means choosing the right migration strategy for your company’s particular circumstances. 

There are three major approaches to data migration: big bang, phased, and parallel. In planning a migration project, the first step is to determine which of these approaches will provide the best opportunity for a successful outcome. Let’s take a look at each one. 

The big bang approach  

In a big bang migration, the whole changeover happens in one go. The frustrations that are inevitable with any migration are intense, but, in theory at least, you get them out of the way in a contained time period. Costs tend to be lower, and you avoid dealing with intermediary solutions or running two systems at once.

This approach often means the migration is done over a single weekend. When users log in at the beginning of the next week, they log into the new system, and the old one is completely offline. This avoids having to run both the old and new systems simultaneously. Since no production operations take place in the interval between shutting down the legacy system and bringing up the target system, the necessity of synchronising the two systems is eliminated. 

However, because of that interval during which both the old and new systems are necessarily offline, the big bang approach is only suitable for businesses that don’t require their systems to be online 24/7. Also, since there is a specific and limited time window for the changeover to be accomplished, any glitches that occur during the migration process could have a severe impact on the company’s operations if that window is exceeded. 

Because of these exposures, the big bang approach is considered relatively high risk. It works best when the amount and complexity of the data to be migrated are small, or when data from only a few offices is involved. Failure can lead to long periods of downtime and intense frustration as users are corralled into a new system all at once. A large transition performed big bang style can lead to major interruptions, which most organisations can’t afford.

The parallel (or parallel run) approach 

The new system is installed alongside the old one, and both operate in tandem during the transition. Updates are posted to both systems until the migration is complete. Once it has been validated that the new system is functioning correctly, the old one is turned off. Parallel migration mitigates risk somewhat since the old environment stays functional while the parallel environment is established. 
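The dual-posting and validation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production design: plain dictionaries stand in for the legacy and target systems, and the class name `DualWriteStore` and its methods are hypothetical.

```python
_MISSING = object()  # sentinel so a stored None is not confused with "absent"


class DualWriteStore:
    """Hypothetical wrapper that posts every update to both systems."""

    def __init__(self, legacy: dict, target: dict):
        self.legacy = legacy
        self.target = target

    def write(self, key: str, value) -> None:
        # During the parallel run, each update goes to both environments.
        self.legacy[key] = value
        self.target[key] = value

    def read(self, key: str):
        # The legacy system stays the source of truth until cut-over.
        return self.legacy[key]

    def mismatches(self) -> list[str]:
        """Keys whose values differ between the systems, or exist in only one."""
        keys = self.legacy.keys() | self.target.keys()
        return sorted(
            k for k in keys
            if self.legacy.get(k, _MISSING) != self.target.get(k, _MISSING)
        )


legacy = {"order-1": "paid"}        # existing data, not yet copied across
target = {}
store = DualWriteStore(legacy, target)

store.write("order-2", "shipped")   # new update lands in both systems
print(store.mismatches())           # ['order-1']

target["order-1"] = legacy["order-1"]   # backfill the historical record
print(store.mismatches())               # []
```

Only when the mismatch list is empty, and stays empty under live traffic, would the old system be a candidate for switching off.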

The advantage of this approach is that current production is not disrupted, and migration issues can be fully dealt with before the target system takes over. This is the least risky of the three strategies because, in the event of problems with the new system, you can switch back to the legacy system. 

Establishing and managing two complete environments at once, however, is expensive in terms of both infrastructure usage and personnel costs, so migrating everything this way quickly becomes cost-prohibitive.

The phased approach 

Also called iterative migration, this approach means data is migrated in small increments over time, on a per-module, per-volume, or per-subsystem basis. As each increment is transferred to the target system, bugs can be worked out and any required user retraining accomplished in small chunks, rather than having to be done for the entire system all at once. The result is less risk than with a big bang migration, but with a much-extended changeover time frame. Because of the longer time required to complete the migration, costs can be greater.  

You can migrate one or a few offices at a time, or you can start with applications that have few or no interdependencies. Phased migration gives users a chance to get used to new ways of doing things. It can also be more complex to manage as you bridge old and new storage to keep applications functional while you transition.  

One of the best ways to ensure a smooth migration with the phased approach is to use per-volume remote mirroring, which also serves as a disaster recovery and backup solution. Mirroring creates a natural, seamless phased migration. Because replication is snapshot based, only data that’s changed gets replicated, and only the most recent change is synchronised, which saves bandwidth and minimises the impact on service performance. When all volumes are migrated to the cloud, you can start running your new workloads there.
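To illustrate why snapshot-based replication sends so little data, here is a toy sketch. It assumes a volume can be treated as a byte string split into fixed-size blocks and that change detection is done by comparing block hashes between snapshots; real storage systems typically track changed blocks internally rather than hashing. All names (`snapshot`, `changed_blocks`, `replicate`) and the 4-byte block size are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, purely for illustration


def snapshot(volume: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Record one hash per block: a point-in-time fingerprint of the volume."""
    return [
        hashlib.sha256(volume[i:i + block_size]).hexdigest()
        for i in range(0, len(volume), block_size)
    ]


def changed_blocks(previous: list[str], current: list[str]) -> list[int]:
    """Indices of blocks that are new or differ from the previous snapshot."""
    return [
        i for i, digest in enumerate(current)
        if i >= len(previous) or previous[i] != digest
    ]


def replicate(source: bytes, mirror: bytearray, previous: list[str],
              block_size: int = BLOCK_SIZE) -> tuple[list[str], int]:
    """Copy only changed blocks to the mirror; return new snapshot and count."""
    current = snapshot(source, block_size)
    changed = changed_blocks(previous, current)
    for i in changed:
        start = i * block_size
        mirror[start:start + block_size] = source[start:start + block_size]
    del mirror[len(source):]  # drop trailing blocks if the source shrank
    return current, len(changed)


source = bytearray(b"AAAABBBBCCCC")
mirror = bytearray()

snap, sent = replicate(bytes(source), mirror, [])
print(sent)              # 3  (initial sync copies every block)

source[4:8] = b"XXXX"    # two writes to the same block between snapshots...
source[4:8] = b"YYYY"
snap, sent = replicate(bytes(source), mirror, snap)
print(sent)              # 1  (...yet only the latest state of one block is sent)
print(mirror == source)  # True
```

The second pass shows both properties mentioned above: unchanged blocks are skipped, and intermediate writes between snapshots never cross the wire.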

Good planning is at a premium with a phased migration, since dependencies between modules must be thoroughly mapped out in advance so that modules don’t become “orphans” in either the legacy or target systems. In fact, phased migration is the option most frequently used today. Dylan Jones, editor of Data Migration Pro, notes that their recent Data Migration Research Study indicates that 62 percent of migration projects use the phased approach.
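Once dependencies are mapped, they can be turned into an ordered set of migration phases. The sketch below is one possible approach, assuming a module should move only after everything it depends on has moved; it uses `graphlib.TopologicalSorter` from the Python standard library (3.9+). The module names and the `migration_waves` helper are invented for illustration.

```python
from graphlib import TopologicalSorter


def migration_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group modules into waves; each wave depends only on earlier waves.

    Raises graphlib.CycleError if the dependency map contains a cycle,
    which would mean no safe phased ordering exists as mapped.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # modules whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical dependency map: module -> modules it relies on.
dependencies = {
    "reporting": {"billing", "crm"},
    "billing": {"customer-db"},
    "crm": {"customer-db"},
    "customer-db": set(),
    "intranet-wiki": set(),
}

for number, wave in enumerate(migration_waves(dependencies), start=1):
    print(number, wave)
# 1 ['customer-db', 'intranet-wiki']
# 2 ['billing', 'crm']
# 3 ['reporting']
```

Modules with no interdependencies land in the first wave, and nothing is moved while something it relies on is still stranded in the legacy system.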

The important thing is to understand the advantages and drawbacks of each approach, and to choose the one most suitable for your organisation’s specific needs. The truth is, most migrations involve some mix of big bang, parallel and phased, but the right balance depends on your budget, your timeframe, and your risk threshold.

Kerry Telling, Sales Director, Northern Europe at Zadara Storage 


Kerry Telling
Kerry Telling is sales director, Northern Europe at Zadara Storage, a provider of enterprise-class, high-performance, high-availability and scalable cloud storage, offered as a cost-effective pay-as-you-go service both on-premise and in the public cloud.