The automated process vs. bad data


Organisations are continually striving to maintain their competitive edge. Faced with cost-saving and economic pressures, automating processes is one area with the potential to make a dramatic impact.

Fundamentally, automation is about reducing or removing as much manual work as possible and turning it into a process which can be run automatically by a machine or digital solution – typically with the advantages of speeding up the process and freeing up capacity. 

Picture the impressive images we see from car production lines: teams of robot arms moving with precision and piecing together cars at speeds no human could ever replicate. But an automated process cannot succeed simply by going faster or removing human interaction; it must also deliver a quality product.

Minimising mistakes through automation 

By reducing manual intervention, automated processes can minimise mistakes and human error – but there is still the chance that something can go wrong. Designers of automated processes need to ensure that the appropriate quality outcomes are being measured and assessed against a given specification. Importantly, this must happen throughout the entire process.

Let’s think about the car production line again. The cost of finding out that something went wrong at the start of the production process after the car has been built is significant. Instead, process designers will want to identify errors quickly and allow the process to make the necessary changes to ensure a quality product is delivered. 

A significant quantity of data is generated through automated processes, but quantity of data is no substitute for quality. To deliver a quality product at the end of an automated process, a quality data management process is critical. But what is bad data? And, if everything is being automated anyway, why should we care?

The importance of high-quality data

Data is the basic raw material used to generate the information needed to make effective decisions. As with anything, if the raw materials are of poor quality, the quality of the end product suffers: high-quality decisions have to come from high-quality data. Key components to look for include validity, accuracy, completeness and timeliness.

Before automating anything, organisations need to start by making sure the data they are generating is valid – and both the methods and tools used to measure data must be fit for purpose. If a piece of equipment is not appropriately calibrated, or a test method does not give the true result, then the automated process will not deliver and will instead provide false results (and, in turn, faulty products). Building in appropriate data quality checks at each stage of data generation prevents this from happening.
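
To make this concrete, the sketch below shows one way a validity check might be applied at the point of data generation. It is a minimal Python illustration only; the specification fields, limits and probe name are assumptions invented for the example, not taken from any particular system.

```python
from dataclasses import dataclass

# Hypothetical measurement specification; the field names and limits are
# illustrative, not taken from any particular system.
@dataclass
class Spec:
    name: str
    min_value: float
    max_value: float

def validate_reading(spec: Spec, value: float) -> None:
    """Reject an out-of-specification value at the point of generation,
    rather than discovering it after downstream processing."""
    if not (spec.min_value <= value <= spec.max_value):
        raise ValueError(
            f"{spec.name}: reading {value} outside spec "
            f"[{spec.min_value}, {spec.max_value}]"
        )

# Example: a temperature probe calibrated for a 2-8 degree C range.
validate_reading(Spec("cold_store_temp_c", 2.0, 8.0), 5.4)    # passes
# validate_reading(Spec("cold_store_temp_c", 2.0, 8.0), 11.2) # would raise
```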

Having accurate and complete measurements is crucial; however, this can be challenging with automated processes, where diverse sets of variables are frequently assessed from different sources. Ensuring that accuracy is not compromised when data is consolidated and processed is no easy task.

Data must also be complete. Results are not just numbers; factors such as format and context are just as important for data used in cumulative calculations. Without the additional layer of information that context and format provide, calculations can be faulty. One simple safeguard is a completeness gate, as sketched below.
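
The following sketch assumes a result record must carry a value, a unit, a timestamp and an instrument identifier; the required fields are assumptions chosen purely for illustration.

```python
# Required context fields for a single result record; this set is an
# assumption chosen for illustration.
REQUIRED_FIELDS = {"value", "unit", "timestamp", "instrument_id"}

def check_completeness(record: dict) -> list[str]:
    """Return the required fields that are missing or empty, so an
    incomplete record can be rejected before it feeds a cumulative
    calculation."""
    return sorted(f for f in REQUIRED_FIELDS if record.get(f) in (None, ""))

record = {"value": 42.0, "unit": "s", "timestamp": "2024-03-01T09:30:00Z"}
print(check_completeness(record))  # ['instrument_id']: record is incomplete
```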

Capturing context and removing risk

Consider, for example, the impact of measuring a duration without the associated unit. The value generated could be seconds, minutes, hours or any of the many other duration units available. Not knowing the unit means we do not know what the value represents, and this can lead to significant errors. Without the necessary minimum information on which to base decisions, it is even harder to automate the decision-making process. Vagueness, or even a touch of uncertainty, adds risk to the process.
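
One defensive pattern is to carry the unit alongside the value, so calculations cannot silently mix seconds with minutes. The sketch below is a minimal illustration of the idea; the Duration type and its conversion table are assumptions made for the example, not a reference implementation.

```python
from dataclasses import dataclass

# Conversion factors to seconds; an illustrative subset of units only.
_TO_SECONDS = {"s": 1.0, "min": 60.0, "h": 3600.0}

@dataclass(frozen=True)
class Duration:
    value: float
    unit: str

    def in_seconds(self) -> float:
        """Normalise to seconds, failing loudly on an unknown unit rather
        than guessing what a bare number represents."""
        if self.unit not in _TO_SECONDS:
            raise ValueError(f"Unknown duration unit: {self.unit!r}")
        return self.value * _TO_SECONDS[self.unit]

# A bare 90 is ambiguous; Duration(90, "min") is not.
total = Duration(90, "min").in_seconds() + Duration(30, "s").in_seconds()
assert total == 5430.0
```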

Data sets can quickly become out of date, and this is particularly true with an automated process. To combat this, it’s important that the data being used to generate decisions and drive the automated process is current. The timestamp for the data being generated is a key piece of context to ensure the data is relevant to the process being assessed.
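
A simple way to enforce this is a freshness check against the timestamp before a reading is allowed to feed a decision. The sketch below assumes a 15-minute freshness window; both the window and the exclusion policy are illustrative assumptions, as any real threshold would be process-specific.

```python
from datetime import datetime, timedelta, timezone

def is_stale(timestamp: datetime, max_age: timedelta) -> bool:
    """Flag data whose timestamp is older than the window the process
    treats as current; the window is process-specific."""
    return datetime.now(timezone.utc) - timestamp > max_age

reading_time = datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc)
if is_stale(reading_time, max_age=timedelta(minutes=15)):
    # Assumed policy: stale readings are excluded from the decision.
    print("Reading excluded: older than the 15-minute freshness window")
```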

Successful automation has to be based on high-quality, precise and time-stamped data, and the management of that data is essential to avoid errors and reduce risks: the two desired outcomes of automation.

Bad data is a problem that won't disappear: all datasets are flawed, some are just more flawed than others. A focus on data governance and the adoption of data standards, where appropriate, helps mitigate the risks of generating information from invalid, inaccurate or incomplete data, and fundamentally drives quality through the automated process.

Automation might seem like the obvious way to transform your organisation and gain a competitive edge – but never rush ahead without considering the quality of your data. Remember, automation with bad data can only mean one thing: the quicker production of unwanted and faulty products. 

Rory Quinn, Principal Solutions Consultant at IDBS   
