Governments around the world are hoping that the behavioral changes brought about by the Covid pandemic will help to kick-start a widespread green economic recovery. Businesses everywhere are making their own optimistic commitments. Reports from the United Nations Framework Convention on Climate Change (UNFCCC) substantiate this. It has announced that commitments to reaching net zero emissions have roughly doubled in less than a year, with 2040 a common target date. Around 22 regions, 452 cities, 1,101 businesses, 549 universities and 45 of the world’s biggest investors are pushing for a green recovery. Amazon is among them and is now running TV advertising to publicize its carbon-free goals.
Much of the current discussion around achieving carbon neutrality focuses on cutting the consumption of plastics, reducing reliance on fossil fuels, adopting sustainable production techniques, improving the biodegradability of packaging materials and so on. These are clearly important, but so too is cutting the energy waste and emissions associated with data centers, especially among the predominantly service-based ‘information industries’. There are now well over 8 million data centers globally, according to Statista, and they consume vast amounts of electricity, generating carbon emissions that dwarf those of the airline industry. Knowledge or information industries might not have the same obvious sustainability concerns as manufacturers and retailers, but they are no less polluting when it comes to storing their data.
The reason is that almost all organizations have one thing in common: what Gartner describes as ‘dark data’. These are information assets that are collected, processed and stored during regular business activities, especially in relation to digital transformation programs, but that generally go unused for other purposes such as analytics and business relationship management. It is rather like the dark matter in our universe, which CERN estimates to make up around 27 percent of the universe, except that dark data is far more prevalent. Within a typical organization’s universe of information assets, experts suggest that upwards of 50 percent of what is in the data center is dark data. It is the organizational equivalent of hoarding, with few organizations having a strategy or automated processes in place to understand what is being stored and to manage data across its lifecycle.
Industry 4.0 era
Dark data is a problem because data centers require vast amounts of electricity. In 2020, the energy consumption of data centers is expected to account for 3.5 percent of total worldwide carbon emissions, a share expected to grow to nearly 40 percent by 2040. By 2025, they are expected to consume 20 percent of the world's electricity, more than any other sector. That is equivalent to what the food, iron and steel, and paper industries of the Organization for Economic Co-operation and Development (OECD) countries combined are currently consuming. Organizations are focusing on cutting their use of red diesel and plastics, but what about the environmental cost of their data? Poor data management is a serious (and entirely avoidable) waste problem that is growing at an exponential rate.
Consider this. In 2010, IDC estimated that 1.2 zettabytes (1.2 trillion gigabytes) of new data were created that year, over 40 percent more than a year earlier. At the time, it predicted that new data creation would reach 35 zettabytes (35 trillion gigabytes) in 2020. It must have seemed a huge number at the time, but it was way off: that level was reached two years early. IDC's latest forecast is 175 zettabytes (175 trillion gigabytes) by 2025. That is five times the figure it originally predicted for 2020, a 400 percent increase.
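The arithmetic behind these forecasts is easy to check. The short Python sketch below uses only the two IDC figures quoted above (35 and 175 zettabytes); the function and variable names are our own, chosen for illustration.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase when going from old to new."""
    return (new - old) / old * 100


# 1 zettabyte = 10^21 bytes = 10^12 gigabytes (decimal units),
# which is why 1.2 zettabytes equals 1.2 trillion gigabytes.
GB_PER_ZB = 10**12

original_forecast_zb = 35   # IDC's original forecast, as quoted in the article
revised_forecast_zb = 175   # IDC's revised forecast, as quoted in the article

multiple = revised_forecast_zb / original_forecast_zb
increase = percent_increase(original_forecast_zb, revised_forecast_zb)

print(f"Revised forecast: {revised_forecast_zb * GB_PER_ZB:,} gigabytes")
print(f"That is {multiple:.0f} times the original, a {increase:.0f} percent increase")
```

Going from 35 to 175 zettabytes is a fivefold rise, which corresponds to a 400 percent increase over the original figure.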
It is not surprising that so much data is being created. We are in the middle of the ‘fourth industrial revolution’, a ‘big data’ era in which organizations everywhere are investing in digital transformation, the Internet of Things, cryptocurrency, AI, blockchain, automation, e-commerce, online wealth management, e-banking and telemedicine. Data lies at the center of all these strategies, and with these new applications for data comes an increased compliance requirement.
Although industry regulations vary, large amounts of transactional data must be retained for compliance purposes, often for decades. In a world where the volume of data is quadrupling every five years, this just adds to the environmental and financial cost of managing its ongoing storage. In fact, research has shown that only a fraction of the data organizations hold in their systems, around 10 to 15 percent, is actually being used. The rest is legacy information, much of it totally redundant.
Practical steps forward
So, returning to those original calculations: if data centers account for 3.5 percent of carbon emissions today, a share forecast to reach nearly 40 percent by 2040, and are on course to consume up to a fifth of the world's electricity; and if 50 percent of the data stored there is dark data, with only 10 to 15 percent actively being used, that is a lot of wasted energy. Reducing this would have a significant positive effect on organizations’ net zero targets. What can be done to minimize this wastage and store only the data that is actually needed?
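To give a feel for the scale, the calculation can be sketched in a few lines of Python. This is a rough, back-of-envelope estimate and not a measurement: it assumes that the electricity a data center uses scales in direct proportion to the amount of data it stores, which is a strong simplification. The function name and the storage_energy_fraction parameter are our own, introduced purely for illustration; the input percentages are those quoted in this article.

```python
def unused_data_electricity_share(
    datacenter_share: float,
    unused_data_fraction: float,
    storage_energy_fraction: float = 1.0,
) -> float:
    """Estimate the share of total electricity spent on unused data.

    Assumption (ours, not a measured fact): energy use is proportional
    to the volume of data held. storage_energy_fraction is a hypothetical
    knob for the part of data center energy that depends on stored data;
    the default of 1.0 makes the result an upper bound.
    """
    for value in (datacenter_share, unused_data_fraction, storage_energy_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be fractions between 0 and 1")
    return datacenter_share * storage_energy_fraction * unused_data_fraction


DATACENTER_SHARE_2025 = 0.20   # a fifth of world electricity by 2025
DARK_DATA_FRACTION = 0.50      # upwards of 50 percent is dark data
ACTIVELY_USED_FRACTION = 0.10  # lower end of the 10 to 15 percent range

dark_share = unused_data_electricity_share(DATACENTER_SHARE_2025, DARK_DATA_FRACTION)
idle_share = unused_data_electricity_share(
    DATACENTER_SHARE_2025, 1 - ACTIVELY_USED_FRACTION
)

print(f"Upper bound attributable to dark data: {dark_share:.0%} of electricity")
print(f"Upper bound attributable to all unused data: {idle_share:.0%} of electricity")
```

Under these assumptions, dark data alone would account for up to around 10 percent of the world's electricity by 2025. The true figure will be lower, because not all data center energy is tied to the volume of data stored, but it illustrates why the issue deserves attention.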
For an average midsized organization that holds 1,000TB of data, the cost of storing non-critical information is estimated at more than £550,000 annually. These estimates reflect what we are witnessing amongst our clients, many of whom are actively focused on reducing their carbon footprint. For example, a well-known drinks manufacturer has been working with TJC to review its enterprise data and reduce unnecessary energy consumption through automated data archiving projects. So far, this work has identified 55TB of dark data that could be archived, reducing energy consumption by 149 percent. Although we all appreciate that the financial argument is only one aspect of the issue, in monetary terms alone this equated to a saving of over €2 million.
It is not surprising that the UN is reporting such a huge increase in organizations making public announcements about their decarbonization plans. Some 80 percent of consumers say they most admire brands that demonstrate a commitment to sustainability. But if organizations and governments really want to prioritize achieving net zero targets by 2040 or earlier, much greater attention should be directed at the data they are consuming energy to store, with information lifecycle management used to reduce waste at all levels.
What practical steps can organizations take now to cut the energy waste associated with data storage?
- Complete a full data audit to identify what data exists and the lifecycle of that data, and consult with stakeholders to establish priorities for data volume management (DVM).
- Establish an Information Lifecycle Management (ILM) strategy, with clear policies for the ongoing management and retention of data, taking into consideration regulatory compliance issues like GDPR.
- Identify a way to automate the data archiving and decommissioning process so that it runs as an ongoing sweep in the future.
- Implement reporting to monitor the long-term ROI of the ILM strategy, the gradual reduction in data TCO and the positive impact on carbon reduction goals.
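As an illustration of how the audit and automated sweep described above might fit together, here is a minimal Python sketch. It is not taken from SAP ILM or any other product: the DataSet fields, the one-year inactivity threshold and the three categories are all hypothetical choices for the example, and retention periods are approximated as 365 days per year. Real retention rules depend on the regulations that apply to each type of data.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataSet:
    """One entry in a (hypothetical) data audit inventory."""
    name: str
    size_tb: float
    created: date
    last_accessed: date
    retention_years: int  # legal retention period; illustrative values only


def classify(ds: DataSet, today: date, inactive_after_days: int = 365) -> str:
    """Return 'keep', 'archive' or 'decommission' for one data set."""
    # Recently used data stays in the live system.
    if today - ds.last_accessed <= timedelta(days=inactive_after_days):
        return "keep"
    # Approximation: treat each retention year as 365 days.
    retention_end = ds.created + timedelta(days=365 * ds.retention_years)
    # Unused data still inside its retention period is moved to an archive.
    if today < retention_end:
        return "archive"
    # Unused data past its retention period can be decommissioned.
    return "decommission"


def summarise(inventory: list[DataSet], today: date) -> dict[str, float]:
    """Total the volume in TB per category, for ongoing reporting."""
    totals = {"keep": 0.0, "archive": 0.0, "decommission": 0.0}
    for ds in inventory:
        totals[classify(ds, today)] += ds.size_tb
    return totals


# Example inventory with made-up data sets.
inventory = [
    DataSet("orders_2020", 10.0, date(2020, 1, 1), date(2020, 12, 1), 7),
    DataSet("invoices_2016", 40.0, date(2016, 1, 1), date(2017, 1, 1), 10),
    DataSet("logs_2009", 15.0, date(2009, 1, 1), date(2010, 1, 1), 5),
]

for category, size_tb in summarise(inventory, date(2021, 1, 1)).items():
    print(f"{category}: {size_tb} TB")
```

Running a sweep like this on a schedule, and tracking the totals over time, gives a simple basis for the reporting step: the archive and decommission volumes show how much storage, and therefore energy, could be released.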
Mani Singh, Project Manager and SAP ILM Consultant, TJC UK & Ireland