
Six ways CIOs can stop process mining becoming a privacy issue


Process mining technologies help businesses uncover how their processes actually behave, unblock bottlenecks, and identify areas for process optimization. On the surface, for a business embarking on its digital transformation journey, there doesn’t seem to be any particular worry for CIOs, only benefits. Digging deeper, however, it’s clear that to truly reap the rewards of your investment, you must address privacy concerns and protect people’s data privacy rights.

One example is event logs, the starting point for process mining. Event logs record information about each activity. In a logistics setting, this could be the person who initiates the activity (for example, the employee who logs an order), the time stamp when the order was logged, or data such as the size of the order or its destination. Businesses use event logs to improve their processes based on tangible data insights, rather than guesses and assumptions. However, the nature of the data means that event logs inevitably contain information that can be used to identify individuals.
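To make this concrete, here is a minimal sketch of what such a log might look like for the logistics example. The field names (case_id, resource, destination, size_kg) and the sample values are illustrative assumptions, not a standard event log schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One row of a hypothetical logistics event log."""
    case_id: str         # the order this event belongs to
    activity: str        # what happened
    timestamp: datetime  # when it happened
    resource: str        # who did it: a direct identifier
    destination: str     # order attribute: a possible indirect identifier
    size_kg: float       # order attribute


# Made-up sample data for illustration only.
event_log = [
    Event("order-1001", "Log order", datetime(2024, 3, 4, 9, 15), "alice.smith", "Leeds", 12.5),
    Event("order-1001", "Dispatch order", datetime(2024, 3, 4, 14, 40), "bob.jones", "Leeds", 12.5),
    Event("order-1002", "Log order", datetime(2024, 3, 4, 9, 50), "alice.smith", "Cardiff", 3.0),
]


def activities_by_resource(log):
    """Group activity names by the person who performed them."""
    result = {}
    for event in log:
        result.setdefault(event.resource, []).append(event.activity)
    return result


profile = activities_by_resource(event_log)
```

Even this tiny log yields a per-person activity profile, which is why the resource column needs protecting before analysis.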

Deidentifying this personally identifiable information can help CIOs safeguard privacy rights. The two common methods of deidentification are anonymization and pseudonymization. Anonymization is the more stringent mechanism: it permanently removes any direct identifiers of personal information, but the drawback is that it can limit the later use of process mining results. Pseudonymization means processing personal data so that it can no longer be attributed to a specific data subject without the use of additional information, which is kept separately. However, it may still be possible to reidentify personal information through security attacks, or by an adversary familiar with the data set.
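The difference between the two methods can be sketched in a few lines. This is an illustration under stated assumptions, not a complete deidentification scheme: pseudonymization is shown as a keyed hash (HMAC-SHA256) of the identifier, where the secret key plays the role of the "additional information", and anonymization is shown simply as dropping the direct identifier. The function names, field names, and key value are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (assumed technique: HMAC-SHA256).

    The same input and key always give the same pseudonym, so events by one
    person can still be grouped. Guessing names and re-hashing them only works
    for someone who holds the key.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


def anonymize(event: dict, direct_identifiers: set) -> dict:
    """Drop direct identifiers entirely; the link to the person is not kept."""
    return {k: v for k, v in event.items() if k not in direct_identifiers}


event = {"case_id": "order-1001", "activity": "Log order", "resource": "alice.smith"}
key = b"example-key-keep-in-a-separate-secure-store"  # placeholder value

pseudonymized = {**event, "resource": pseudonymize(event["resource"], key)}
anonymized = anonymize(event, {"resource"})
```

The pseudonymized log still supports questions such as "how many orders does each employee log?", while the anonymized one does not. That is the utility trade-off described above.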

One thing’s for sure: CIOs can combat the threat. The best way to balance process mining against safeguarding privacy rights is to apply responsible process mining practices and technologies. Here are six key ways to ensure deidentification methods are successful while maintaining the integrity of process mining and reaping its rewards:

1) Risk of re-identification

First, evaluate the risk of re-identification associated with analysis of the event data. After all, it’s better to be safe than sorry.

Clearly, if the event data contains personally identifiable or sensitive personal information, it must be deidentified, for example by substituting a replacement value. However, there may still be a possibility of reidentification based on combining event log attributes with other available data sources. The secrets, here, are in the data.
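A small sketch shows how combining attributes with another data source can undo a replacement value. Everything here is hypothetical: the log, the staff roster standing in for an "other available data source", and the choice of site and shift as the linking attributes.

```python
def link_records(deidentified_log, external_records, linking_attributes):
    """Return pseudonym -> name for log entries that match exactly one external record."""
    matches = {}
    for event in deidentified_log:
        key = tuple(event[a] for a in linking_attributes)
        candidates = [
            record["name"]
            for record in external_records
            if tuple(record[a] for a in linking_attributes) == key
        ]
        if len(candidates) == 1:  # a unique match means reidentification
            matches[event["resource"]] = candidates[0]
    return matches


# Names already replaced, but site and shift left intact (made-up data).
deidentified_log = [
    {"resource": "user-a1", "site": "Leeds", "shift": "night"},
    {"resource": "user-b2", "site": "Leeds", "shift": "day"},
    {"resource": "user-c3", "site": "Cardiff", "shift": "day"},
]

# A second source an adversary might hold (made-up data).
staff_roster = [
    {"name": "Alice Smith", "site": "Leeds", "shift": "night"},
    {"name": "Bob Jones", "site": "Leeds", "shift": "day"},
    {"name": "Carol White", "site": "Leeds", "shift": "day"},
    {"name": "Dan Brown", "site": "Cardiff", "shift": "day"},
]

reidentified = link_records(deidentified_log, staff_roster, ["site", "shift"])
# user-a1 and user-c3 are each the only person on their site and shift, so
# they are reidentified; user-b2 is hidden among two candidates.
```

Two of the three replaced names are recovered without touching the original log, which is why the assessment has to cover attribute combinations and not only the obvious identifiers.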

2) Mitigate the possibility of reidentification

Once the risk of re-identification has been assessed, it is important to minimize any possibility that personal information can be recovered. This can be achieved with a data governance structure and policy.

This involves initially evaluating the intended uses and users of event logs collected for analysis and then determining the variables included before measuring their reidentification risk. Finally, it is important to document results consistent with data privacy and security requirements.

3) Control the conditions when data is used

A key way to ensure the de-identification methods are successful is by specifying the way data can be used.

There are different “release” models associated with the secondary use of personal information: public, quasi-public, and non-public. Public release models should apply the most stringent de-identification protocols, while quasi-public and non-public release models ought to include specific contractual provisions covering confidentiality and terms of use.

4) Nature of variables

Evaluate the nature of the variables in event logs by asking a few questions: Do they contain sensitive data? Do they include indirect identifiers that may create risk of reidentification? Are there additional sources of publicly available data that may be linked to indirect identifiers in event logs? What is the likelihood that an adversary who may be familiar with the event logs would be able to reidentify data subjects?  

It’s important to uncover variabilities in task execution to enable automation safeguards that mitigate the risk of compliance violations. By asking more questions, CIOs are able to tick more boxes on their path to ensuring privacy is safeguarded.

5) Measure and identify reidentification risk

This step will depend on the context of event logs, the number of attributes that comprise event logs, and the number of records that share the same values for those attributes. Each group of such records is referred to as an equivalence class. The smaller the equivalence classes, the higher the probability of reidentification, because each record has fewer look-alikes to hide among. In these instances, more rigorous deidentification measures need to be considered.
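One common way to put a number on this is k-anonymity, which is an assumption here since the article does not name a specific measure. Records are grouped into equivalence classes by their values for the chosen attributes, k is the size of the smallest class, and 1/k serves as a worst-case reidentification probability. The records and the generalize_hour helper below are hypothetical.

```python
from collections import Counter


def equivalence_classes(records, attributes):
    """Count how many records share each combination of attribute values."""
    return Counter(tuple(r[a] for a in attributes) for r in records)


def k_anonymity(records, attributes):
    """Size of the smallest equivalence class (k)."""
    if not records:
        raise ValueError("no records to measure")
    return min(equivalence_classes(records, attributes).values())


def worst_case_risk(records, attributes):
    """Reidentification probability for a record in the smallest class."""
    return 1 / k_anonymity(records, attributes)


def generalize_hour(record):
    """Coarsen the exact hour into morning or afternoon (a more rigorous measure)."""
    part = "morning" if record["hour"] < 12 else "afternoon"
    return {"site": record["site"], "hour": part}


# Made-up records: every (site, hour) combination is unique.
records = [
    {"site": "Leeds", "hour": 9}, {"site": "Leeds", "hour": 10},
    {"site": "Leeds", "hour": 14}, {"site": "Leeds", "hour": 15},
    {"site": "Cardiff", "hour": 9}, {"site": "Cardiff", "hour": 11},
    {"site": "Cardiff", "hour": 13}, {"site": "Cardiff", "hour": 16},
]
attributes = ["site", "hour"]

risk_before = worst_case_risk(records, attributes)      # 1.0: k = 1
generalized = [generalize_hour(r) for r in records]
risk_after = worst_case_risk(generalized, attributes)   # 0.5: k = 2
```

Coarsening the time stamp doubles the size of the smallest class and halves the worst-case risk, at the cost of less precise timing for the process analysis.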

6) Record internal controls

It is important to stay on top of privacy, and that involves documenting the internal controls that protect these privacy rights.

For example, the General Data Protection Regulation imposes rigorous obligations on data controllers and processors to maintain a record of processing activities under their responsibility. Furthermore, organizations are subject to audit provisions and, upon request from supervisory authorities, must co-operate with them and make those records available.

Process mining and data privacy can co-exist

Today’s information systems generate an unprecedented amount of data from both digital and physical sources. Understanding process data in more detail empowers organizations to gain insight into their business process flows and to see which processes need to be improved.

Process mining can provide your organization with comprehensive insights about your processes and fuel your improvement initiatives. Process bottlenecks and inefficiencies in the customer journey, along with tasks that consume a high degree of time, manual effort and cost, as well as processes where being non-compliant poses legal and financial risk, are among the key challenges organizations face today.

The ambition of responsible process mining is to achieve a balance between its utility and safeguarding privacy rights. Process mining must strictly protect sensitive information through a multi-faceted set of data security services. Now more than ever, digital security is incredibly important, which means that businesses must do all they can to improve their cybersecurity efforts. Integrating privacy-enhancing technologies and best practices will create trust and confidence in their continued growth. It is in the best interest of the industry to adopt privacy enhancing practices. The goal is to gain new insights into processes, safely.

Andrew Pery, Ethics Evangelist, ABBYY