Data centres vs mother nature: Tackling winter crises

(Image credit: YouTube)

The winter months are a testing time for the IT industry. Back in 2015-2016, thousands lost communication as severe flooding hit the north-east of England, inundating various IT assets. Vodafone engineers worked tirelessly over the festive period to fix their Leeds-based data centre. 

And 2019 could be another testing year, as weather experts warn that the ‘real winter’ is set to kick in with the potential arrival of a new ‘Beast from the East’ towards the end of January. Some even predict it will be the coldest winter on record, with icy gales and a risk of heavy snowfall across the UK. Globally, big snowstorms have already caused major disruption in the south-eastern United States, with North Carolina declaring a state of emergency last month after more than two feet of snow fell in some areas. As the winter season rolls on, it’s vital for data centre operators to be proactive and adjust their equipment to keep it performing well.

As most data centre operators know, data centres face static and backup power issues when winter storms hit. Since the majority of data centres are climate-controlled and generator-backed, these issues largely resolve themselves. That doesn’t mean that data centre operators and support vendors are completely off the hook for winter weather-related concerns, though. Winter is a testing time, and unpredictable weather patterns mean operators must prepare for all types of scenarios.

Here are a few things that IT staff should consider in the face of another brutal winter season.

The role of IT leaders throughout a winter storm

Regardless of the weather conditions, disaster recovery provisions should always be in place and ready to activate, and IT leaders should be key stakeholders in their initial and ongoing development. There are several best practices to follow when preparing your data centre for a winter storm: ensure generators are fully provisioned with oil and fuel, test your generators regularly, and clear roofing materials of ice and other winter debris. The most important of all may be to maintain a mirror copy of production systems in a second data centre. Every business unit should have continuity and contingency plans, and IT departments are no different. In fact, they will likely play roles in several units’ plans, and they are vital to keeping operations running as smoothly as possible.
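The preparation steps above can be captured as a simple pre-storm check. What follows is only a minimal sketch in Python: the `Generator` record, the `readiness_issues` function and both thresholds are hypothetical names and values chosen for illustration, not figures from any standard or vendor, and each site would set its own.

```python
from dataclasses import dataclass


@dataclass
class Generator:
    name: str
    fuel_pct: float       # current fuel level, as a percentage of tank capacity
    days_since_test: int  # days since the generator was last tested


# Assumed thresholds for illustration only; set these per site policy.
MIN_FUEL_PCT = 90.0
MAX_DAYS_SINCE_TEST = 30


def readiness_issues(generators, roof_cleared, mirror_in_sync):
    """Return a list of open items to resolve before the storm arrives."""
    issues = []
    for g in generators:
        if g.fuel_pct < MIN_FUEL_PCT:
            issues.append(
                f"{g.name}: fuel at {g.fuel_pct:.0f}%, below {MIN_FUEL_PCT:.0f}%"
            )
        if g.days_since_test > MAX_DAYS_SINCE_TEST:
            issues.append(f"{g.name}: last tested {g.days_since_test} days ago")
    if not roof_cleared:
        issues.append("roof: ice or debris not yet cleared")
    if not mirror_in_sync:
        issues.append("mirror: second-site copy of production is out of sync")
    return issues


if __name__ == "__main__":
    fleet = [Generator("gen-a", 95.0, 10), Generator("gen-b", 60.0, 45)]
    for item in readiness_issues(fleet, roof_cleared=True, mirror_in_sync=False):
        print(item)
```

An empty list means every item on this short checklist has been met; anything else is a to-do list for the team before conditions worsen.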

Another critical responsibility of IT leaders during winter weather is to keep customers informed for the duration of a storm. Customers should be aware of the preparations the company is making before a storm arrives and receive frequent updates throughout. They should also be alerted beforehand to all of the potential data risks and possible downtime, so that they can plan accordingly. Communication should be continuous across all stages, whether through detailed notes or quick, digestible notifications to customers.

It’s times like these that make your company stand out and prove the high level of trust and service you’re providing to customers.

Travel is unpredictable, so plan ahead

The biggest mistake that support staff and engineers make is assuming that travel is possible. Delivering a repair part to a data centre could be extremely difficult, if not impossible, when bitter winds and snow cause whiteout conditions on the road. Aside from weather conditions, employees themselves may face problems that prevent them from getting to a company’s data centre site, such as damage to their home or health-related concerns.

Safety and employee well-being should always take priority, but it does mean that when travel to a data centre is deemed impossible, extended server downtime can be an issue. The most effective way to limit downtime in the case of inclement weather is to have “remote hands” or “remote access” capabilities in place.

Remote hands services tackle basic tasks like rebooting servers, disconnecting and reconnecting cables, and monitoring and reporting on indicators. When access to a data centre is impossible – whether because travel is disrupted or because the site is located across the country – these services come in handy. They matter because they allow manual action to be taken to remediate any downed equipment.

Riding out the storm and looking ahead

The passing of a storm doesn’t mean that the work is done for external IT staff. They will be responsible for assessing the aftermath and confirming any damage by conducting extensive, comprehensive tests and cross-checks across all data centre assets.

Looking ahead, organisations should consider infrastructure changes that make the data centre more resilient in the event of a storm. One example is including the cloud in your business continuity or disaster recovery plans, which can also be a very cost-effective way to implement mirror systems or cold standby systems. The ability to maintain backup capacity without paying for the capital infrastructure of a complete mirror system not only protects your precious data from exposure to the elements, as it is stored online, but is also a big cost saver.

It’s vital to always think of ways to keep critical server infrastructure up and running, maximising uptime regardless of the weather. Weather patterns are unpredictable and can make data centre preparedness a challenge, but taking the time to ensure disaster recovery provisions are in place – even if this means considering infrastructure changes – will help your organisation to ride out a winter storm. IT leaders are crucial in this process, both as key stakeholders in developing continuity and contingency plans and as the people responsible for ensuring customers are alert and responsive to adverse conditions. Having remote hands capabilities in place for when travel to the data centre is too dangerous means downtime is limited and staff safety remains the priority. And of course, even when you’re no longer in the eye of the storm, comprehensive tests and cross-checks of data centre assets for any damage are vital. Taking these steps will help your data centre continue to operate safely and efficiently.

Chris Adams, President and CEO of Park Place Technologies