GDPR - Data, Individuals, and Deletion

(Image credit: Shutterstock/Wright Studio)

The European Union’s General Data Protection Regulation (GDPR) has been the hot talking point of the IT industry in 2017. Designed to protect individuals’ personal data and facilitate the exchange of information for businesses that operate in the EU, GDPR outlines new requirements for data collection and processing that include hiring dedicated Data Protection Officers (DPOs) to safeguard the personal information of citizens. This is one of many additional costs that the legislation will bring, with Dimensional Research reporting that GDPR spending is expected to run into six figures.

It’s clear that when GDPR comes into force on 25 May 2018 it will cause a number of headaches for the IT industry. Among them are significant challenges in data identification, categorisation, management and storage, which at their core are questions of database technology, design and structure.

Understanding the role the individual plays in your data

One vital way companies can harness GDPR is to use the underlying IT changes required for compliance to get customer data in better shape, giving themselves the chance to build a 360-degree view of their employees, customers or citizens. That view then becomes a foundation for new revenue-generating applications and services for the business, while boosting customer satisfaction.

By gathering and identifying all of the personal data an organisation has on any individual, how and why this data is being used, and commonalities between data sets, organisations automatically gain valuable insights into the touch points for every individual. This can be leveraged to give customer service, marketing and sales teams a joined-up view of customers and prospects - a ‘golden record’, if you like, of everything relating to a customer (or in the case of a B2B organisation, individuals working for each customer).

The number of data types involved can be enormous. Each golden record may include behavioural, social media, transactional, descriptive and product/service data taken from multiple sources including CRM systems, analytics databases that record user click-through/search, website registration systems, fulfilment systems, call centre audio records, marketing databases, LinkedIn and more.
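As a rough illustration of how such a record might be assembled, the sketch below groups records from several systems under one customer key. The system names, the `customer_id` field and the `build_golden_records` function are all hypothetical, and the sketch assumes every source already shares a reliable customer identifier, which in practice is often the hardest part.

```python
from collections import defaultdict


def build_golden_records(sources):
    """Group records from many systems under one entry per customer.

    `sources` maps a (hypothetical) system name to a list of dicts,
    each carrying a `customer_id` key plus arbitrary attributes.
    """
    golden = defaultdict(dict)
    for system, records in sources.items():
        for record in records:
            attributes = {k: v for k, v in record.items() if k != "customer_id"}
            # Keep attributes grouped by originating system so the
            # provenance of every value stays visible.
            golden[record["customer_id"]].setdefault(system, []).append(attributes)
    return dict(golden)


# Illustrative data only; none of these systems or fields come from the article.
sources = {
    "crm": [{"customer_id": "c1", "name": "Ada Example", "city": "Leeds"}],
    "web_analytics": [{"customer_id": "c1", "last_search": "boots"}],
    "call_centre": [{"customer_id": "c2", "last_call": "2018-01-15"}],
}
golden = build_golden_records(sources)
print(sorted(golden["c1"]))
```

Keeping each attribute under the system it came from, rather than flattening everything into one dictionary, makes it easier to answer the question of where a given piece of personal data lives.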

With a 360-degree view of each customer, organisations can increase revenues and reduce churn by being better able to identify and manage customer interactions and target individuals with more tailored, contextual offers across multiple channels. Plus, customer satisfaction can be enhanced by giving customer-facing representatives all of the customer-specific information they need to respond to a customer’s request or complaint accurately and quickly.

Many organisations will have implemented some form of golden record for their customers. However, this will typically be restricted to a few important attributes about the customer (name, address, etc.), whereas the regulation calls for all data on a customer to be tracked. At the same time, many organisations have realised the benefit that so-called ‘big data’ brings, i.e. considering a complete data set when looking at analytics and insights. GDPR offers the opportunity to extend existing golden record systems to include the entire data set for a customer.

Keeping track of the pieces

We would all like to believe that customer data is kept in neat folders and easily managed, but in reality that is rarely the case. One of the new provisions of GDPR is the requirement for data erasure, both for consumers and business clients. This raises a difficult question: if one of your customers asked you to delete their data, how certain are you that you could do it? If a customer withdraws consent for marketing, for example, it is important not just to delete the appropriate data from the marketing system, but also to ensure that overnight batch updates don’t then flow that same data back into the system.
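One common way to guard against the batch-update problem is a suppression list that every import consults before writing. The sketch below is a minimal illustration under that assumption; the function names, the in-memory `marketing_db` dictionary and the record fields are hypothetical stand-ins for whatever systems are really involved.

```python
def withdraw_consent(marketing_db, suppressed_ids, customer_id):
    """Delete the customer's marketing record and remember the withdrawal."""
    marketing_db.pop(customer_id, None)
    suppressed_ids.add(customer_id)


def apply_batch_update(marketing_db, batch, suppressed_ids):
    """Load a batch into the marketing store, skipping suppressed customers.

    Returns the IDs that were skipped so the run can be audited.
    """
    skipped = []
    for record in batch:
        customer_id = record["customer_id"]
        if customer_id in suppressed_ids:
            skipped.append(customer_id)
            continue
        marketing_db[customer_id] = record
    return skipped


# Illustrative data only.
marketing_db = {"c1": {"customer_id": "c1", "email": "ada@example.com"}}
suppressed_ids = set()

withdraw_consent(marketing_db, suppressed_ids, "c1")

# The overnight batch still contains c1 because the upstream system
# has not caught up with the withdrawal yet.
nightly_batch = [
    {"customer_id": "c1", "email": "ada@example.com"},
    {"customer_id": "c2", "email": "bob@example.com"},
]
skipped = apply_batch_update(marketing_db, nightly_batch, suppressed_ids)
print(skipped, sorted(marketing_db))
```

The check lives in the import path rather than in the deletion step, so it keeps working no matter how many times the stale record reappears upstream. Note that the suppression list itself retains an identifier, so what it stores should be kept to the minimum needed.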

A recent study by CrowdFlower found that 60 per cent of data scientists’ time is spent “wrangling” or “cleansing” data to make it fit their database, while 80 per cent of respondents said this is the least enjoyable part of their role. As the provisions around data erasure and control increase, companies that do not modernise their systems to manage this task will be asking their employees to spend even more time on work they do not enjoy.

With systems added and layered on top of each other over the years, many companies find that no one system holds all the data in one place, or even knows about all the data. The result is a set of diffuse data silos that each need to be accessed separately. How can a company with this sort of infrastructure ensure that it is complying with GDPR? It can’t.

For smaller companies this challenge is less pronounced, as one or two trusted systems can process and manage consumer data. But the larger the network, the more complex the enterprise and the more peripheral devices there are, the greater this problem becomes. Any company with multiple teams accessing and moving data for multiple uses faces the risk that some of it will be misplaced. If you include things like IoT or remote devices that gather data, the problem is amplified.

What can be done? The ideal situation is for a company to create a central database that brings all its data into one place, secures it, links it all together and understands every document and item submitted to the system, as well as the data linked to it. This means that an erasure request can delete not only the data itself but also the items linked to it, and then verify that they are gone. This task is monumental if you are working with legacy systems that can only understand data structured in one way, and is an argument for modern NoSQL databases that can handle multiple forms of structured and unstructured data.
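To make the idea of linked erasure concrete, here is a minimal in-memory sketch. The `DataHub` class and its methods are hypothetical and do not represent any particular database product. It also assumes that links only ever connect items about the same individual; a real system would need to check that before following a link.

```python
class DataHub:
    """Toy central store that tracks who each item is about and what it links to."""

    def __init__(self):
        self.items = {}  # item_id -> (owner_id, payload)
        self.links = {}  # item_id -> set of linked item_ids

    def add(self, item_id, owner_id, payload, linked_to=()):
        self.items[item_id] = (owner_id, payload)
        self.links.setdefault(item_id, set())
        for other in linked_to:
            # Record the link in both directions.
            self.links[item_id].add(other)
            self.links.setdefault(other, set()).add(item_id)

    def erase(self, owner_id):
        """Erase an individual's items and everything linked to them.

        Returns the set of erased item IDs and a flag saying whether
        the follow-up verification found nothing left behind.
        """
        pending = [i for i, (owner, _) in self.items.items() if owner == owner_id]
        doomed = set()
        while pending:  # follow links transitively
            item_id = pending.pop()
            if item_id in doomed:
                continue
            doomed.add(item_id)
            pending.extend(self.links.get(item_id, ()))
        for item_id in doomed:
            self.items.pop(item_id, None)
            for other in self.links.pop(item_id, set()):
                self.links.get(other, set()).discard(item_id)
        # Verification pass: nothing owned by the individual, and none of
        # the erased IDs, may remain in either structure.
        leftovers = [i for i, (owner, _) in self.items.items() if owner == owner_id]
        leftovers += [i for i in doomed if i in self.items or i in self.links]
        return doomed, not leftovers


# Illustrative data only.
hub = DataHub()
hub.add("profile-1", "c1", {"name": "Ada Example"})
hub.add("ticket-9", "c1", {"subject": "Late delivery"}, linked_to=["profile-1"])
hub.add("recording-3", None, "call audio", linked_to=["ticket-9"])
hub.add("profile-2", "c2", {"name": "Bob Example"})

erased, verified = hub.erase("c1")
print(sorted(erased), verified)
```

The call recording has no owner recorded against it, yet it is still removed because it is linked to the customer's support ticket. That is the kind of item that tends to be missed when erasure is driven by owner fields alone.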

It is important to understand that the original systems are not necessarily to be decommissioned. Many can be made GDPR compliant, and removing them would create unnecessary business risk. However, by consolidating data (or at least the metadata - i.e. knowledge of what data each system contains, and, importantly, exactly to whom it pertains) into a single, central place or data hub, it becomes possible to manage the myriad of disparate datasets.
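A metadata-only hub of this kind can be pictured as an index from each individual to the systems and categories of data held about them. The sketch below is an assumption-laden illustration: the `MetadataCatalogue` class, the system names and the category labels are all hypothetical.

```python
from collections import defaultdict


class MetadataCatalogue:
    """Records which systems hold which categories of data about each person.

    Only metadata is stored here; the personal data itself stays in the
    original systems.
    """

    def __init__(self):
        # person_id -> system name -> set of data categories
        self._index = defaultdict(lambda: defaultdict(set))

    def register(self, person_id, system, data_category):
        self._index[person_id][system].add(data_category)

    def locate(self, person_id):
        """Return every system known to hold data about this person."""
        systems = self._index.get(person_id, {})
        return {system: sorted(categories) for system, categories in systems.items()}


# Illustrative entries only.
catalogue = MetadataCatalogue()
catalogue.register("c1", "crm", "contact details")
catalogue.register("c1", "crm", "purchase history")
catalogue.register("c1", "call_centre", "audio recordings")
catalogue.register("c2", "marketing", "email address")

print(catalogue.locate("c1"))
```

When an erasure or access request arrives, `locate` gives the list of systems that need to be visited, which is exactly the knowledge that is missing when data sits in unconnected silos.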

The challenge of owning data

While data gathering and collection was initially seen as an entirely positive thing, opinion has turned as stricter legal requirements are placed on storing and holding the data. Companies face the challenge of a large volume of data stored in different silos, not all of it understood by the systems responsible for processing it. It is vital that companies facing these two challenges, identifying and processing an individual’s data and guaranteeing its erasure, understand the scope of the issue.

Companies across the EU are now realising that neither of these tasks is straightforward as long as they rely on legacy systems. Older IT systems can lead to data being left on individual devices and not recorded, while databases that require data structured in specific formats eat into data workers’ time.

There are only two solutions: either stop collecting data and completely delete any files that cannot be accounted for, rapidly downsizing your data burden, or install a system that is capable of parsing all the data in all your systems and making it available to be audited and controlled.

Christy Haragan, Global GDPR Lead, MarkLogic

Christy Haragan is the Global GDPR Lead at MarkLogic.