Is your test data GDPR compliant? Key strategies to adopt GDPR regulations for testing

With the General Data Protection Regulation (GDPR) deadline set for 25 May 2018, companies have barely six months left to comply with the new EU data protection requirements. Businesses cannot afford to assume that these regulations do not apply to them. Whether your company handles consumers’ personal data directly, or indirectly through production information, GDPR applies either way.

Test data management (TDM) is one area that needs GDPR attention. Though essential for efficient data processing and for testing the quality of deliverables, TDM is exposed to gaps in regulatory and organisational standards, particularly because current compliance requirements are not as stringent as GDPR.

When production data is routinely copied to a non-production environment for testing, companies must be able to ensure that this customer data stays secure while they improve their internal processes and efficiencies. GDPR is set to have wide-ranging implications for the type of data that can be used in non-production environments. Organisations will need to understand the nature of the data and who is using it, and must be able to restrict its use to tasks for which consent has been given.

Though there are various measures, including subsetting and masking, to protect personally identifiable information (PII), below are some guidelines on how businesses can ensure their test data is GDPR compliant:

How GDPR impacts Testing 

Production data cannot simply be copied as-is. Among the new rules introduced by GDPR is the right to restrict the use of personal data. If production data is sourced for testing, data managers need to apply anonymization techniques to all personally identifiable information, and the process must be irreversible. This calls for good documentation of data flows and data models, and for adequate test data profiling. Organisations that already have anonymization in place need to take stock of their current masking techniques and determine whether additional controls are needed. GDPR also stresses the need to safeguard data transferred to countries outside the EU, and organisations must have a purge mechanism to erase any requested data from all data sources once testing is complete.
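
One way to picture irreversible masking is to replace each personal field with a freshly generated random value, so that the masked output is not a function of the original and no key exists that could reverse it. The sketch below is illustrative only: it assumes a record is a plain dictionary, and the field names in `PII_FIELDS` and the helper `anonymize_record` are hypothetical.

```python
import secrets

# Hypothetical set of fields treated as personal data in this sketch.
PII_FIELDS = {"name", "email", "phone"}


def anonymize_record(record, pii_fields=PII_FIELDS):
    """Return a copy of the record with PII fields replaced by random tokens.

    Each replacement comes from a random source and does not depend on the
    original value, so it cannot be computed back to that value.
    """
    masked = {}
    for field, value in record.items():
        if field in pii_fields:
            masked[field] = "anon_" + secrets.token_hex(8)
        else:
            masked[field] = value
    return masked


customer = {"id": 42, "name": "Jane Doe", "email": "jane@example.com", "balance": 120.5}
masked_customer = anonymize_record(customer)
```

The trade-off is that purely random replacement does not keep the same person recognisable across tables, which matters when test cases rely on joins.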

Key principle guidelines to ensure your test data & testing are GDPR compliant 

  • Well-defined documentation of personal data in all test environments
  • Effective data discovery to understand and unearth sensitive data
  • Implementing the TDM process for the entire data life cycle, including profiling, subsetting, masking, provisioning and archiving data in test environments
  • Ensuring an irreversible “on-the-fly” masking process as production data is copied to a centralized repository
  • Permissions and alerts in place for data exports and access from outside the region, as this is restricted
  • Preventing access to personal data from unauthorized access points
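
As a rough illustration of the data discovery guideline above, the following sketch samples rows from a table and flags columns whose values look like email addresses or phone numbers. The patterns, the `threshold` parameter and the function name `discover_sensitive_columns` are assumptions made for this example; real discovery tooling would need far broader, locale-aware rules.

```python
import re

# Illustrative patterns only; not a complete definition of personal data.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def discover_sensitive_columns(rows, threshold=0.5):
    """Map column name to a detected PII type when enough sampled values match."""
    if not rows:
        return {}
    findings = {}
    for column in rows[0]:
        values = [str(row[column]) for row in rows if row.get(column) is not None]
        if not values:
            continue
        for label, pattern in PATTERNS.items():
            hits = sum(1 for value in values if pattern.fullmatch(value))
            if hits / len(values) >= threshold:
                findings[column] = label
                break
    return findings


sample_rows = [
    {"id": 1, "contact": "jane@example.com", "mobile": "+44 20 7946 0958", "note": "vip"},
    {"id": 2, "contact": "raj@example.org", "mobile": "+91 98765 43210", "note": "new"},
]
flagged = discover_sensitive_columns(sample_rows)
```

Here `flagged` marks `contact` and `mobile` as candidates for masking, while `id` and `note` are left alone.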

Best practices to ensure test data in non-production environments are GDPR compliant 

1. Awareness throughout the organisation: First and foremost, businesses should understand the challenge of becoming GDPR compliant. Above all, be aware that companies handling EU personal data must be GDPR compliant by May 25, 2018. Ensuring that every piece of data is secured in the correct way is a complex and time-consuming process, so now is the time to start. Compliance involves data security, IT and cybersecurity protection, and restructuring business processes. Businesses should understand their existing data landscape as it relates to personal data, and be able to identify the sensitive data held in their application databases.

2. Formulate your GDPR strategy: Creating a robust strategy to tackle GDPR issues is the next step. Form a core governance and execution team to work on delivering GDPR solutions. Comprehensive masking rules are essential for adhering to regulations. These rules should be applied through an irreversible, on-the-fly process that incorporates all the required rules. Next, devise a strategy for managing both masked data and synthetic data for different testing needs. The aim is to minimise dependence on masked production data over the next few years.
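
A minimal sketch of what rule-driven, on-the-fly masking could look like: a rule table maps column names to masking functions, and rows are masked while they stream from the source, so unmasked values are never written to the test store. The rule names, column names and the `mask_stream` helper are hypothetical and stand in for whatever a real TDM tool provides.

```python
import secrets


def mask_email(_value):
    # Replace the whole address with a random one on a reserved test domain.
    return "user_" + secrets.token_hex(4) + "@example.invalid"


def mask_digits(value):
    # Keep the format (length and separators) but randomise every digit.
    return "".join(str(secrets.randbelow(10)) if ch.isdigit() else ch for ch in value)


def redact(_value):
    return "REDACTED"


# Hypothetical rule table: column name -> masking function.
MASKING_RULES = {"email": mask_email, "phone": mask_digits, "name": redact}


def mask_stream(rows, rules=MASKING_RULES):
    """Yield masked copies of rows one at a time as they are read."""
    for row in rows:
        yield {
            column: rules[column](value) if column in rules and value is not None else value
            for column, value in row.items()
        }


source_rows = [{"id": 1, "name": "Jane Doe", "email": "jane@example.com", "phone": "020-7946-0958"}]
masked_rows = list(mask_stream(source_rows))
```

Keeping the rules in one table makes it easier for a governance team to review which columns are covered and how.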

3. GDPR for people, process & technology: Ideally, test data management should have a dedicated GDPR team to understand and tackle challenges that arise across the entire data life cycle: profiling, subsetting, masking, provisioning, and building repositories of data. With strict data version control and centralized data access for relevant test data stakeholders, the team should be able to adopt a better framework.

4. Adopting a synthetic data framework: An enterprise-wide synthetic data generation framework lets the team build data sets from predefined models. Synthetic data generation should also follow the business rules and data models of the testing scenarios for each environment. Data can be generated from a variety of models, namely the production database model, test scenario model, business process model, random data model, test process packs, and domain-based data model, and specific tools can be used to generate the required data. We recommend service and API virtualisation techniques that mimic the expected data set response so that testing can be completed. This reduces the time spent waiting for required data from external services.
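
To make model-based generation concrete, here is a small sketch that builds synthetic customer records from a field-level model using a seeded random generator. The model `CUSTOMER_MODEL`, its fields and value ranges, and the function `generate_synthetic` are all invented for illustration and do not describe any particular tool.

```python
import random

# Hypothetical data model: field name -> function producing a value from an RNG.
CUSTOMER_MODEL = {
    "customer_id": lambda rng: rng.randint(100000, 999999),
    "country": lambda rng: rng.choice(["DE", "FR", "IE", "NL"]),
    "segment": lambda rng: rng.choice(["retail", "business"]),
    "balance": lambda rng: round(rng.uniform(0, 5000), 2),
}


def generate_synthetic(model, count, seed=0):
    """Build `count` records from the model; the same seed gives the same data."""
    rng = random.Random(seed)
    return [{field: make(rng) for field, make in model.items()} for _ in range(count)]


synthetic_customers = generate_synthetic(CUSTOMER_MODEL, count=5, seed=7)
```

Seeding the generator keeps test runs repeatable, and because no value is derived from production data, the resulting records contain no real personal data.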

5. Data audit & control mechanisms: Lastly, regular database audits and protection will minimise exposure of the system to external users who should not have access to personal data. They will also help compensate for any application-level security weaknesses that might leave the organisation open to a data breach. With so many global tooling and technology partners available to support your clients and customers, partnering with trusted technology providers will help secure the system. GDPR compliance should be an ongoing process, not a one-time solution. Any new process, automation or compliance measure should support both existing business-as-usual processes and new challenges.
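
A regular audit could, for instance, review access events against an allowlist of users and regions. The sketch below assumes access events are already available as simple records; the `AccessEvent` type, the user names, the table names and the region codes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    user: str
    region: str
    table: str


# Hypothetical policy values for this example.
AUTHORIZED_USERS = {"tdm_admin", "qa_lead"}
ALLOWED_REGIONS = {"EU"}
PERSONAL_DATA_TABLES = {"customers", "payments"}


def audit_access(events):
    """Return (event, reason) pairs for accesses to personal data that break policy."""
    findings = []
    for event in events:
        if event.table not in PERSONAL_DATA_TABLES:
            continue  # Only tables holding personal data are in scope.
        if event.user not in AUTHORIZED_USERS:
            findings.append((event, "unauthorized user"))
        elif event.region not in ALLOWED_REGIONS:
            findings.append((event, "access from outside allowed region"))
    return findings


events = [
    AccessEvent("tdm_admin", "EU", "customers"),
    AccessEvent("contractor", "EU", "customers"),
    AccessEvent("qa_lead", "US", "payments"),
    AccessEvent("contractor", "US", "build_logs"),
]
findings = audit_access(events)
```

Running such a check on a schedule, rather than once, fits the point that compliance is an ongoing process.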

Conclusion 

Production data can no longer be used in test environments without proper anonymization. If an organisation decides to implement anonymization, the process should be irreversible, so that masked data cannot be traced back to the original, and masking should be applied consistently to related groups of fields across the database. This approach requires systems and processes for intelligent discovery of sensitive data, both dynamic and static masking techniques, and a reusable, enterprise-wide synthetic data generation framework. Organisations should leverage their existing masked production copy mechanism while pursuing a long-term strategy of adopting more and more synthetic data generation in subsequent releases.
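
Consistent masking across related fields can be sketched with a keyed one-way hash: the same input always maps to the same masked value within a run, so joins between tables still work. This example assumes the key exists only in memory and is discarded when the run ends; whether that is enough to count as anonymization in a given setting is a judgment this sketch does not make. The class `ConsistentMasker` and the table layouts are hypothetical.

```python
import hashlib
import hmac
import secrets


class ConsistentMasker:
    """Map equal inputs to equal masked values using a key held only in memory."""

    def __init__(self):
        # Assumption: the key is never persisted, so the mapping cannot be
        # recomputed after this object is gone.
        self._key = secrets.token_bytes(32)

    def mask(self, value):
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        return "m_" + digest[:16]


masker = ConsistentMasker()
customers = [{"email": "jane@example.com", "name": "Jane"}]
orders = [{"order_id": 1, "customer_email": "jane@example.com"}]

masked_customers = [{**c, "email": masker.mask(c["email"])} for c in customers]
masked_orders = [{**o, "customer_email": masker.mask(o["customer_email"])} for o in orders]
```

The masked email in `masked_customers` matches the one in `masked_orders`, so referential integrity between the two tables is preserved without exposing the original address.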

Sathiya Narayanan, Senior Consulting Manager with Data Assurance practice at Wipro Limited 

Image Credit: Wright Studio / Shutterstock