
Making decisions with data – The role for data preparation

Making use of data within companies is one of those challenges that should deliver huge results. In its FutureScape predictions for 2016, analyst firm IDC estimated that, worldwide, those companies analysing all their information for actionable insight should see an extra $430 billion in productivity gains, compared to those that did not look at all their data.

However, while it’s relatively easy to say that you want to become data-driven, it’s much harder to achieve this goal.

Where are we now with data discovery?

One approach that individuals within companies have taken is to make use of desktop data discovery tools to give business analysts more insight. These tools are user-friendly, approachable and provide analysts with a great shortcut to visualising data in new ways. By using data discovery in this way, individuals can avoid some of the pain involved with juggling spreadsheets or requesting data reports from central IT teams.

However, this shortcut relies on having the right data to work with in the first place. Preparing data so that it is ready for analysis is an essential step, and it’s the one where ordinary business users will require the most assistance. Traditional Business Intelligence implementations included data preparation as part of the overall approach to delivering what the business requires. As these projects tended to be large, business-critical ones related to finance and company performance, the data preparation element was handled by central IT teams that were experts in how to work with data.

For desktop discovery, simple visualisations of data from a single spreadsheet or application can be created without much data preparation effort. However, combining different spreadsheets or results from multiple applications is a different story. Without proper preparation time and skills, the results from this analysis may be misguided, incomplete or actually harmful.
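To make the risk concrete, here is a minimal sketch of how combining two spreadsheets can silently drop records when the shared column has not been cleaned first. The rows, column names and the `join_on_customer` helper are all hypothetical, invented for illustration rather than taken from any particular tool.

```python
# Hypothetical rows exported from two spreadsheets (illustrative values only).
sales = [
    {"customer": "Acme Ltd", "revenue": 1200},
    {"customer": "Globex", "revenue": 800},
]
support = [
    {"customer": "acme ltd ", "tickets": 5},  # same customer, inconsistent spelling
    {"customer": "Globex", "tickets": 2},
]


def join_on_customer(left, right, normalise=False):
    """Inner-join two lists of rows on the 'customer' column."""
    def key(row):
        name = row["customer"]
        # Preparation step: trim whitespace and ignore case before matching.
        return name.strip().lower() if normalise else name

    lookup = {key(row): row for row in right}
    joined = []
    for row in left:
        match = lookup.get(key(row))
        if match is not None:
            joined.append(
                {
                    "customer": row["customer"],
                    "revenue": row["revenue"],
                    "tickets": match["tickets"],
                }
            )
    return joined


naive = join_on_customer(sales, support)
prepared = join_on_customer(sales, support, normalise=True)
print(len(naive), len(prepared))
```

Without the normalisation step, the "Acme Ltd" row finds no match and disappears from the combined result with no warning, which is exactly the kind of incomplete analysis described above.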

For analysts, the amount of time spent on data preparation can be deceptive. In 2015, Ventana Research estimated that about 45 per cent of business analyst time was dedicated to checking data for quality and consistency, as well as preparing these sources for analysis. Reducing this amount of time increases productivity and opens up more opportunities for those analysts to develop and share insights based on their analysis work.

The challenge for data preparation: self-service

Automating the data preparation phase within wider analytics projects should help company teams get more value from their data. However, this is not a simple step. Part of the difficulty lies in the different audiences on the business side that want to get more out of data. When we talk of making it easier for the business to use data, there is a big difference between business analysts who have developed skills in working with data and other employees who have not. Getting data preparation in place can currently help the former group, but not the latter.

The reason for this is that business analysts are already data-savvy. While data preparation tools can help them get results faster, they already have an understanding of the fundamental concepts around information management – such as tables, keys, relationships, and joins – that are necessary when getting data ready for analysis. In this case, the automation covers a ‘known process’ that is now easier.

For most employees, these concepts are unknown. While data preparation tools can be offered to them, they are not familiar with how to get data ready to be used in the first place. In these cases, most users stick with their existing spreadsheets and continue to struggle along. To bring data to the wider business, the data preparation process has to be made useful and understandable to all users, rather than just business analysts. Without this step, it’s very difficult to scale up use of analytics to all business users.

There are a few ways that this can be approached so that all business users can benefit from faster access to data. The emphasis should be on making things simpler to achieve, more automated, repeatable over time, and ultimately available to all users through self-service models – rather than only for those with existing data preparation skills.

One approach involves semantics. Semantics is how we define the meaning of things. In the technology sector, it covers how metadata on IT assets, files and data can be used to manage analytic projects more efficiently. By looking at the semantic information about data, it’s therefore possible to link up data sets around common themes or subjects. This linking – or networking – of data prevents the proliferation of information silos, and it promotes trust when using data to make business decisions.

An example would be looking at customer records across all the business systems within the organisation. Customer details can be captured in marketing, sales, finance, support and supply chain applications. By looking at the semantic data about those applications, it’s possible to link up that information automatically and then use it to organise analytics accordingly. Each team across the business will be interested in different results and reports, but the data behind the analytics is trusted because it is based on the same set of definitions, and not coming from individual silos of information.
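The customer example can be sketched in a few lines. This is only an illustration under stated assumptions: the `SEMANTIC_MAP` dictionary, the system names and the field names are hypothetical, and a real semantic layer would carry far richer metadata than a simple field-name mapping.

```python
# Hypothetical semantic layer: each system's local field names are mapped
# to one shared set of definitions.
SEMANTIC_MAP = {
    "marketing": {"cust_ref": "customer_id", "mail": "email"},
    "sales": {"account_no": "customer_id", "deal_value": "revenue"},
    "support": {"client": "customer_id", "open_cases": "open_tickets"},
}


def to_shared_terms(system, record):
    """Rename a record's fields to the shared definitions, dropping unmapped fields."""
    mapping = SEMANTIC_MAP[system]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def link_records(records_by_system):
    """Merge records from every system into one view per customer_id."""
    linked = {}
    for system, records in records_by_system.items():
        for record in records:
            shared = to_shared_terms(system, record)
            # This sketch assumes every record carries a customer identifier.
            customer_id = shared.pop("customer_id")
            linked.setdefault(customer_id, {}).update(shared)
    return linked


records = {
    "marketing": [{"cust_ref": "C001", "mail": "ops@example.com"}],
    "sales": [{"account_no": "C001", "deal_value": 1200}],
    "support": [{"client": "C001", "open_cases": 3}],
}
customer_view = link_records(records)
print(customer_view)
```

Because every team reads from `customer_view`, reports for marketing, sales and support are built on the same definition of a customer, even though each source system names its fields differently.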

In addition, making more use of visual aids during the data preparation phase can help non-analysts carry out their own self-service analytics. The phrase, “I’ll know it when I see it,” may be a frustrating one for experts, but for many business users, seeing the data is how they find their own answers. At the same time, this should not be seen as a case of letting employees use data on their own, without guidance.

Instead, users can and should be guided to combinations of data that are going to work well, rather than against each other. By providing guidance as part of the process automatically, IT teams can ensure that users of all kinds get what they need out of data without having to prepare the data for them. This same approach can be applied to visualisation or other analytics tools that end-users might start to fashion for themselves. The aim for IT should be a supporting role, which provides guidance and supports users helping themselves. This enables IT to avoid taking responsibility for the analysis task completely or leaving users completely to their own data devices.
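One simple form of automatic guidance is to suggest only those combinations of data sets that share a common field. The sketch below assumes a hypothetical catalogue, `DATASET_FIELDS`, listing the fields in each data set; real tools would likely use richer rules than matching field names.

```python
from itertools import combinations

# Hypothetical catalogue of data sets and the fields each one contains.
DATASET_FIELDS = {
    "orders": {"customer_id", "order_date", "amount"},
    "tickets": {"customer_id", "opened_on", "severity"},
    "web_traffic": {"page", "visits"},
}


def suggest_combinations(dataset_fields):
    """Return pairs of data sets that share at least one field, with the shared fields."""
    suggestions = []
    for left, right in combinations(sorted(dataset_fields), 2):
        shared = dataset_fields[left] & dataset_fields[right]
        if shared:
            suggestions.append((left, right, sorted(shared)))
    return suggestions


suggestions = suggest_combinations(DATASET_FIELDS)
for left, right, shared in suggestions:
    print(f"{left} + {right} can be combined on {shared}")
```

Here a user would be steered towards combining orders with tickets, while web traffic, which has no field in common with either, would not be offered as a pairing.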

This emphasis on data preparation is an important one for all companies that want to get more out of their business information. Data and analytics initiatives should provide business users across the company with insight that is available to them at all times. The alternatives involve retreating into more traditional IT models that rely on highly skilled central teams, or developing more business analysts within each department who can wrangle data.

While these skills with data can ultimately speed up some analytics projects, getting the whole business to use data on a day-to-day basis should be the goal. Self-service BI and data preparation for all business users – not just analysts in IT or the business – is an essential step toward meeting that objective.

Pedro Arellano, Senior Director of Product Strategy, Birst

Image source: Shutterstock/Bruce Rolff

Pedro Arellano
Pedro Arellano is vice president, product strategy at Birst, leading development around networked data and analytics. Prior to Birst, he led marketing at MicroStrategy and hosted the Stereo Gol radio show.