Customer data is arguably the most valuable asset any modern business holds, but within the next year the regulations governing it are set to change dramatically. Irrespective of what companies and other organisations use the data they hold for, they will be affected by the General Data Protection Regulation (GDPR), which takes effect in May 2018.
This pan-European deadline is bringing the testing community to the forefront of data handling practices, as companies face up to the challenge of how to mask customer data or build accurate and useable synthetic data, while retaining referential integrity for testing new products or services. Overarching this is a requirement to ensure that individuals’ data is processed securely and that an audit trail is in place for compliance purposes.
While the penalties for non-compliance have been well publicised, that awareness doesn’t seem to have had much impact: a mere 19 per cent of UK CIOs currently have plans in place to deal with the regulation.
Here we take a look at the major challenges presented by GDPR and the transformative role test data management can play in a new product/service development environment.
Most wanted: Test data management
It is certainly arguable that poor test data management represents the biggest risk to a business in terms of breaching GDPR. Test data can be the ‘forgotten man’ when building business-driven test scenarios, but from a GDPR compliance viewpoint it will need to play a central role.
Test data management covers a wide range of quality-assurance-driven disciplines that support all IT and business test phases. Its key activities typically include:
- Targeting and creating non-production data sets that mimic actual data so that a business can perform accurate and relevant tests before releasing a new service or product or updating an existing one.
- Building synthetic data where it is not possible or acceptable to use ‘real’ data.
- Ensuring data can be shared across IT and business teams.
- Enabling data to ‘time travel’ to support complex business test scenarios.
- Planning effective backup and restore capabilities.
- Supporting build and deployment of test environments.
These activities, and more, will be essential to ensuring GDPR compliance whilst still allowing projects to function.
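One way to read the ‘time travel’ item in the list above is as shifting every date in a test data set by a fixed offset, so that a scenario such as a policy renewal can be exercised without waiting for the calendar. The sketch below assumes that reading; `shift_dates` and the `policies` records are hypothetical names invented for illustration, not part of any particular tool.

```python
from datetime import date, timedelta


def shift_dates(records, days):
    """Return copies of the records with every date value moved by `days`."""
    offset = timedelta(days=days)
    return [
        {key: (value + offset if isinstance(value, date) else value)
         for key, value in record.items()}
        for record in records
    ]


# Hypothetical synthetic policy record used only for this example.
policies = [
    {"policy_id": "P1", "start": date(2017, 6, 1), "renewal": date(2018, 6, 1)},
]

# Age the whole data set by 365 days; the gap between dates is preserved.
aged = shift_dates(policies, days=-365)
print(aged)
```

Because every date moves by the same offset, relationships between dates within the set stay intact, which is usually what a business test scenario depends on.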
Consent and transparency go hand in hand
Data subjects will need to provide specific and active consent covering the use of their data, and this requires an update of most processes around the gathering and withdrawal of consent. Consent for third-party processing is also affected, as the data owner is liable for the data wherever it is handled. Changing how consent is acquired and managed will in turn necessitate product change and associated testing. Different test data sets, scenarios and combinations of consent will need to be verified to ensure that each handles consent correctly.
Companies will also need to define and then manage legitimate data use and length of storage time before archival and deletion. Testing will, again, be crucial. It is important to note that the regulator will require evidence of due diligence, particularly for exceptions where data can be processed without an individual’s agreement for legitimate business reasons, such as pursuing outstanding debts.
In the past a copy of real information will often have been used to test systems, but this is not tenable in the world of GDPR. Individuals need to give explicit and informed consent that their data can be used for testing, which is not something that can be baked into consent for other purposes. Indeed, attempting to do so could be regarded as a breach of GDPR in its own right.
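To make the idea of verifying combinations of consent concrete, the following is a minimal sketch that enumerates every granted/withheld combination for a small set of purposes. The purposes listed and the `may_process` rule are assumptions made for illustration; a real product would define its own.

```python
from itertools import product

# Hypothetical consent purposes, chosen only for this example.
PURPOSES = ("marketing", "profiling", "third_party_sharing")


def consent_scenarios(purposes=PURPOSES):
    """Return every granted/withheld combination of the given purposes."""
    return [
        dict(zip(purposes, flags))
        for flags in product((True, False), repeat=len(purposes))
    ]


def may_process(consents, purpose):
    """Allow processing only where consent was actively given."""
    return consents.get(purpose, False) is True


if __name__ == "__main__":
    for scenario in consent_scenarios():
        allowed = [p for p in PURPOSES if may_process(scenario, p)]
        print(scenario, "->", allowed)
```

Three purposes already give eight scenarios, and the number doubles with each purpose added, which is why generating the combinations tends to be more reliable than listing them by hand.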
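A retention rule of this kind can be expressed as a simple, testable check. The sketch below is illustrative only: the six-year period, the record layout and the `exception` field are all assumptions, not figures taken from the regulation.

```python
from datetime import date, timedelta

# Assumed retention period, for illustration only.
RETENTION = timedelta(days=6 * 365)


def overdue_for_deletion(records, today, retention=RETENTION):
    """Return ids of records held past the retention period that have
    no documented exception (for example an outstanding debt)."""
    return [
        record["id"]
        for record in records
        if today - record["last_activity"] > retention
        and not record.get("exception")
    ]


# Hypothetical synthetic records.
records = [
    {"id": 1, "last_activity": date(2007, 3, 1)},
    {"id": 2, "last_activity": date(2016, 9, 15)},
    {"id": 3, "last_activity": date(2008, 1, 10), "exception": "outstanding debt"},
]

print(overdue_for_deletion(records, today=date(2018, 5, 25)))  # prints [1]
```

Recording the exception alongside the data, rather than in someone's head, is also one way to produce the evidence of due diligence that a regulator may ask for.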
New requirements, new risks
The GDPR legislation states that individuals have the right to data portability, a concept that allows customers to move, copy or transfer personal data from one service provider to another in a safe and secure way.
This relatively new concept will require significant testing, and ensuring compliance for data in flight will be a major exercise for organisations that have high volumes of live data in non-protected environments.
Through test data management, testers will have access to the data in a structured and readable format and be able to confirm that the original data has been removed from the ‘source’ system.
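The two checks described above, a structured export and confirmation of removal from the source, might be sketched as follows. The in-memory `source` dictionary stands in for a real system, and `export_customer` and `remove_customer` are hypothetical helpers written for this example.

```python
import json


def export_customer(store, customer_id):
    """Serialise one customer's record to JSON, a structured, readable format."""
    return json.dumps(store[customer_id], sort_keys=True)


def remove_customer(store, customer_id):
    """Delete the customer's record from the source store."""
    del store[customer_id]


# Hypothetical in-memory stand-in for the source system, holding synthetic data.
source = {
    "c-001": {"name": "Test Person", "email": "test.person@example.com"},
    "c-002": {"name": "Other Person", "email": "other.person@example.com"},
}

original = dict(source["c-001"])
payload = export_customer(source, "c-001")
remove_customer(source, "c-001")

# The export must round-trip without loss, and the source must no longer
# hold the record that was transferred.
assert json.loads(payload) == original
assert "c-001" not in source
```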
All of the above will drive new risk-based test scenarios, which in turn will affect how late-phase testing such as User Acceptance Testing (UAT) is defined, planned and undertaken. This will have an additional impact on the quantity of ‘Must Tests’ that will need to be executed within a UAT window.
Additionally, with the fines for non-compliance so high, it is essential to ensure that any existing functionality is not downgraded or negatively impacted by new changes. In terms of ongoing assurance, this clearly increases the need for regression testing across projects. GDPR testing and test cases will now need to be added to your regression pack.
Is data masking vital for your business?
Testing will always require the use of either real or synthetic data. While the use of synthetic data may theoretically be preferable from a risk point of view, it is not always practicable and any real data will need to be anonymised. Because of this, organisations will need to build up data skills within their testing teams as they move away from using simple copies of live production data.
In data masking, testing teams use powerful tools to enable anonymisation whilst protecting the real source data. Some tools enable data snapshots, where users no longer work on the database but on a copy of the data that is isolated and anonymised. Other strategies are also available, for example dynamic anonymisation, where the result of a query is anonymised in real time so there is no need to take a snapshot of the source data.
Using synthetic data necessitates skills in both data analysis, to understand the system under test, and data modelling, to design an efficient data set. Technical skills will be needed to actually create the synthetic data, probably driving the need for specific data creation tools for all but the simplest data sets.
Additionally, new assurance processes and procedures will be needed for the team, to ensure personal data is not exposed to anyone who is not authorised to handle it.
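Commercial masking tools work in different ways, so the following is only a sketch of one common technique rather than a description of any product: replacing identifiers with a keyed hash, so that the same input always yields the same token and references between tables still line up. `MASKING_KEY`, `mask` and the sample tables are hypothetical. Strictly speaking this is pseudonymisation rather than full anonymisation, since anyone holding the key can reproduce the tokens.

```python
import hashlib
import hmac

# Assumed secret; in practice it would be kept outside the test environment.
MASKING_KEY = b"example-key-not-for-real-use"


def mask(value, key=MASKING_KEY):
    """Replace a value with a keyed hash so equal inputs give equal tokens."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "cust_" + digest[:12]


# Hypothetical source tables linked by customer_id.
customers = [{"customer_id": "C1001", "name": "Test Person"}]
orders = [
    {"order_id": 1, "customer_id": "C1001"},
    {"order_id": 2, "customer_id": "C1001"},
]

# Mask the key column in both tables; free-text fields are simply redacted.
masked_customers = [
    {"customer_id": mask(c["customer_id"]), "name": "REDACTED"} for c in customers
]
masked_orders = [
    {"order_id": o["order_id"], "customer_id": mask(o["customer_id"])} for o in orders
]

print(masked_customers)
print(masked_orders)
```

Both masked orders still point at the single masked customer, so joins and foreign-key checks in the test environment continue to behave as they would against the real data.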
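For the simplest data sets, synthetic records can be generated with nothing more than a seeded random number generator, as in the sketch below. The name lists, field names and value ranges are assumptions for illustration; a realistic data model would come from analysis of the system under test.

```python
import random

# Small, invented value pools; no real individuals are represented.
FIRST_NAMES = ["Alex", "Sam", "Jo", "Chris"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown"]


def synthetic_customers(count, seed=0):
    """Generate `count` made-up customer records, repeatable for a given seed."""
    rng = random.Random(seed)
    customers = []
    for i in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        customers.append({
            "customer_id": f"SYN{i:05d}",
            "name": f"{first} {last}",
            # example.com is reserved for documentation, so no mail can escape.
            "email": f"{first}.{last}.{i}@example.com".lower(),
            "birth_year": rng.randint(1940, 2000),
        })
    return customers


for customer in synthetic_customers(3):
    print(customer)
```

Fixing the seed makes the data set reproducible, which helps when a failing test needs to be rerun against exactly the same records.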
Tools and data discovery
In terms of data discovery, 75 per cent of organisations said the complexity of modern IT services means they can’t always know where all customer data resides. A retail client recently conducted a discovery exercise and found terabytes of customer data over ten years old. Under GDPR, the retailer needs to be able to find that data and justify holding on to it for ten years. This will have an impact on Business as Usual and risk management within the organisation.
A similar example is the TalkTalk breach revealed last year, in which some of the breached data was also ten years old. In May, a company that TalkTalk had purchased, a small, now-defunct regional cable and TV company, was revealed as the source of the leaked data.
To avoid non-compliance, documentation of the use of personal data in all test environments is necessary, including backups and personal copies created by testers. An understanding of all real data sources and the current location of data is key to ensuring that no real personal data is exposed to software testers, test managers, business users and other team members during software development, maintenance and test phases.
Some organisations have poorly supported legacy IT systems, and unstructured data makes discovery even more complex, especially if an organisation has emails on file relating to individuals that contain names, addresses, telephone numbers and other contact information. Those responsible for GDPR compliance must be able to examine the databases and search for stored objects, such as email attachments, containing data relating to an individual. Unfortunately, finding the right data within gigabytes of information can be a hugely complex and time-consuming task. Testing teams can help search for the data using the same tools that would typically be used for automation.
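As a rough illustration of what such a search involves, the sketch below scans free text for email addresses and phone numbers using regular expressions. The patterns are deliberately simple assumptions (the phone pattern covers only one UK-style layout) and would miss many real-world formats; dedicated discovery tools go considerably further.

```python
import re

# Simplified patterns, assumed for illustration; real data needs broader rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"(?:\+44\s?|0)\d{4}\s?\d{6}")


def find_personal_data(text):
    """Return the email addresses and phone numbers found in a piece of text."""
    return {"emails": EMAIL.findall(text), "phones": PHONE.findall(text)}


# Synthetic sample text; the address and number are not real.
sample = "Please call Test Person on 07700 900123 or write to test.person@example.com."

print(find_personal_data(sample))
```

Running a scan like this over exported mailboxes or database text columns gives a first inventory of where personal data may be hiding, which can then be reviewed by hand.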
GDPR is not just another regulatory requirement; the transition to becoming a GDPR-compliant organisation is a major undertaking and maintaining compliance requires an ongoing commitment and new ways of testing.
The regulation marks a watershed moment in the treatment of personal data by businesses of all sizes and types, and will prove in some cases to be the major IT challenge over the next few years.
Ongoing and continuous testing of compliance will become a feature of every business’s internal roadmap, and a robust test data management strategy will be a core part of the puzzle. Indeed, developing this strategy will increasingly be seen as a necessary investment to guard organisations against the severe penalties for non-compliance.
Edge Testing Solutions offers Test Data Management and assessment of ongoing GDPR compliance services.
Dan Martland, Head of Technical Testing, Edge Testing Solutions