Data – the new ‘oil’

(Image credit: Shutterstock/alexskopje)

Several high-profile data breaches have recently hit businesses such as British Airways and Air Canada. Data breaches are increasing in number, spiking 75% in just two years, according to the Information Commissioner’s Office (ICO), while complaints about them have rocketed 160% since the General Data Protection Regulation (GDPR) came into force.

The GDPR stipulates that data breaches must be reported within 72 hours of detection, or huge fines (up to €20 million (£16.5m) or 4% of worldwide turnover) can be levied. The GDPR also enables consumers to ask what data a business stores on them, and to request that it be handed over and/or deleted.

The aftermath of a data breach could leave the target company with many thousands of consumers who are concerned about their exposed personal data and likely to invoke the GDPR’s “Right To Be Forgotten” provision. These requests could skyrocket and impose a massive, time-sensitive operational burden on customer service and governance teams. As businesses need to process these requests within a 28-day period, they must be confident that the data they store for thousands of consumers can be located quickly, and that all variations of it can be identified and tagged for action.
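
The two time limits cited above, 72 hours to report a breach and 28 days to process a request, come down to simple date arithmetic. As a minimal sketch, assuming hypothetical function names and a made-up request log (neither is taken from any particular compliance product), a governance team could track them along these lines:

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Time limits as quoted in this article; verify them against current guidance.
BREACH_REPORT_WINDOW = timedelta(hours=72)
ERASURE_REQUEST_WINDOW = timedelta(days=28)


def breach_report_deadline(detected_at: datetime) -> datetime:
    """Latest time a breach detected at `detected_at` should be reported."""
    return detected_at + BREACH_REPORT_WINDOW


def erasure_deadline(received_at: datetime) -> datetime:
    """Latest time a request received at `received_at` should be processed."""
    return received_at + ERASURE_REQUEST_WINDOW


def overdue_requests(received: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the IDs of requests whose deadline has already passed."""
    return sorted(
        request_id
        for request_id, received_at in received.items()
        if erasure_deadline(received_at) < now
    )


# Illustrative data only: request IDs and dates are invented.
requests = {
    "REQ-001": datetime(2019, 1, 7, 9, 0),
    "REQ-002": datetime(2019, 1, 25, 14, 30),
}
print(breach_report_deadline(datetime(2019, 1, 7, 9, 0)))
print(overdue_requests(requests, now=datetime(2019, 2, 10)))
```

The arithmetic is the easy part. The harder problem, as the rest of this article argues, is finding every copy of the data before the clock runs out.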

If this isn’t challenging enough, they must also ensure that important business information can be erased legally. For instance, if an airline deletes personal data relating to a traveller, what impact does that have on the same customer travelling on partner airlines? Companies cannot simply delete data; they must analyse its importance by business function. This is a complicated balancing act, especially if operational efficiency is not to suffer.

The unstructured data dilemma

For the last few decades, businesses have been hoarding customer data but, unfortunately, have invested little in managing it. The result has been a string of high-profile data breaches. However, migration to the cloud, automated data retention and the GDPR have forced businesses to take a much closer look at their stored data: who can access it, how it can be monetised or leveraged, and how it should be managed.

While GDPR compliance is challenging, it also represents a fresh opportunity for businesses to really understand where their data is stored, its business value, its risk level and the security that needs to be applied to it. One of the biggest obstacles to balancing the use of data for business purposes against regulatory compliance is fundamental: businesses don’t really know what data they hold, where it is, its functional context, who has reviewed or copied it, or even whether they can legally delete it.

Most companies find it difficult to assess which documents contain valuable details, such as financial data. It is even more complex to evaluate the sensitivity of a document, or to understand its business context. This is because most (70-80%) of the data a business holds is ‘unstructured’: it is not part of a pre-defined data model and is not stored in a pre-defined way, such as rows and columns within a database. In simple terms, unstructured data is contained within the business documents, such as PDFs and e-mails, that are produced and stored during a working day. In a post-GDPR world, understanding unstructured data is now mandatory.

Rather than being considered a challenge, the journey to understand unstructured data should be seen as an opportunity to unlock significant benefits, such as ensuring that up-to-date information is being used and that no time is spent reproducing documents that already exist. However, these benefits can only be realised if the correct data discovery and management systems are put in place.

Unlocking the promise of data

The process of understanding unstructured data must be efficient, because there is so much of it. For example, businesses of 5,000 to 10,000 employees tend to store approximately 50-100 million documents. Analysing this volume of data manually could take a staggering 400 years’ worth of time and cost. If a business turned to machine learning technology, it would struggle, as such technology is not designed to gauge the context of documents. The task becomes more complicated still if a business has data spanning several different languages.
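
To put the scale of manual review in perspective, here is a rough back-of-the-envelope sketch. The review speed (30 seconds per document) and the working year (1,760 hours for a single reviewer) are illustrative assumptions, not figures from this article, and the function name is hypothetical:

```python
def person_years(num_documents: int,
                 seconds_per_document: float,
                 working_hours_per_year: float = 1760.0) -> float:
    """Estimate the effort of reviewing every document by hand.

    Both the per-document time and the length of the working year are
    assumptions supplied by the caller, not measured values.
    """
    total_hours = num_documents * seconds_per_document / 3600.0
    return total_hours / working_hours_per_year


# Document counts are the range quoted in the article.
for docs in (50_000_000, 100_000_000):
    years = person_years(docs, seconds_per_document=30)
    print(f"{docs:,} documents: about {years:,.0f} person-years")
```

Under these assumptions the estimate runs to hundreds of person-years, the same order of magnitude as the 400-year figure, and it scales linearly with both the document count and the time spent per document.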

Given the scale of the data businesses store, it is obvious that the task of identification could seriously impact everyday operations. Artificial Intelligence (AI) technology can generate inventories of data automatically, with a high level of accuracy, in a rapid timeframe. This makes it well suited to efficient data identification, as it will not impact operations. An AI solution’s ability to create inventory lists for data, with meaningful business categories such as HR and finance, means that companies can understand what data they store and are responsible for.
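
To make the idea of a categorised inventory concrete, the sketch below groups documents under business categories such as HR and finance. It is deliberately naive: it uses simple keyword matching as a stand-in, not AI, and it is not a description of how any vendor’s product works. The keyword lists, file names and function names are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical keyword lists; a real system would need far richer signals.
CATEGORY_KEYWORDS = {
    "HR": {"salary", "payroll", "appraisal", "employee"},
    "Finance": {"invoice", "budget", "payment", "ledger"},
}


def categorise(text: str) -> str:
    """Pick the category whose keywords appear most often in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(words & keywords)
              for name, keywords in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorised"


def build_inventory(documents: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each category to the names of the documents assigned to it."""
    inventory = defaultdict(list)
    for name, text in documents.items():
        inventory[categorise(text)].append(name)
    return dict(inventory)


# Invented sample documents for illustration.
sample = {
    "q3_invoice.pdf": "Invoice 1042: payment due within 30 days.",
    "review.docx": "Annual appraisal and salary review for employee 7.",
    "lunch.txt": "Menu for Friday.",
}
print(build_inventory(sample))
```

Keyword matching of this kind breaks down quickly on real documents, which rarely announce their own category. That gap is exactly why gauging business context is the hard part of the problem.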

Improved data management means a business can boost security by reducing the quantity of data and increasing its quality. Being confident about deleting ‘toxic data’ means that if systems are hit by a data breach, its impact and associated costs can be reduced. Moreover, the less data that is stored, the less irrelevant information is available, the fewer errors are made, and the more straightforward it is to locate what is being sought. This is obviously beneficial to GDPR compliance but, in the long run, it also supports the improvement of business processes and boosts organisational efficiency.

The GDPR is triggering huge changes among businesses. More and more organisations are coming to rely on data identification and management processes to stay compliant and to build in data privacy by design. However, simply by using these tools, businesses can achieve so much more: uncovering the genuine value of data and embracing it to improve business performance.

It’s not just businesses that can benefit here, either. The GDPR means that customers have much greater control over their own private data, where it is stored and how it is used. The entry of the GDPR onto the statute book will help data management to mature across the board and, as a result, both data privacy and business operations will vastly improve.

Steve Abbott, CEO of DocAuthority

Steve is the CEO of DocAuthority, a leading document control solutions company, offering organisations a security policy utilising AI to help discover and accurately identify unstructured and unprotected sensitive documents.