Making decisions with data – the role for machine learning in analytics

(Image credit: Shutterstock/alexskopje)

Machine Learning has long been a complex area of computing, reserved for those with deep technical knowledge and the ability to translate between business requirements, large data sets and how computing systems develop. At least, that is how it has been from the time it was first discussed in 1959 until today.

Over the next few years, it’s been predicted that Machine Learning will become business as usual. Initially, companies will use machine learning to automate certain functions, such as pattern recognition, and improve efficiencies. Over time, Machine Learning will expand to automate more of the analytics steps involved within jobs. 

The results from this work are also predicted to be substantial. Forrester predicts that, by 2020, about $1.25 trillion in revenue will be generated by companies that run their operations based on customer insights – compared to $250 billion in 2015. This will be based on winning customers away from competitors through better execution and a better experience. 

To make Machine Learning work in practice, more automation will be required to embed this technology in the applications that everyday business users work with. However, while automation will make the results easier to generate, there will still be a need for IT to understand the process involved. Without this understanding, the results will be mixed.

Behind the Machine Learning curtain 

At its heart, Machine Learning refers to how computers learn to improve their performance on a given task without being given specific instructions. This is based on techniques that help computers organise sets of data and then make use of that data. As part of this, the data and results being analysed can be fed back into the system so that it improves over time.

There are many ways of handling data, but the two main groups are supervised and unsupervised learning. Supervised learning trains the system on existing data for which the expected results are already known, and then checks the system's predictions against those known results. This approach works best when the required result is already well defined and you want to query data sets for predictions on what might happen in the future.

Supervised learning can itself be divided into two categories: classification and regression. Classification algorithms aim to assign data to distinct categories, while regression techniques are applied where the value to be predicted lies on a continuous scale. Classification can be used to split data into specific groups and then predict which group the next item will fall into; regression predicts where along the scale the next output will appear.
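To make the distinction concrete, here is a minimal sketch in plain Python. It is an illustration only: the data, the function names (`fit_centroids`, `classify`, `fit_line`) and the choice of a nearest-centroid classifier and a least-squares line are assumptions made for this example, not methods named in the article.

```python
from statistics import mean


def fit_centroids(samples):
    """Classification: learn one centroid (mean feature value) per label."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def classify(centroids, value):
    """Assign the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


# Hypothetical labelled data: (purchases per year, customer segment).
labelled = [(1, "low"), (2, "low"), (3, "low"), (8, "high"), (9, "high"), (10, "high")]
centroids = fit_centroids(labelled)
predicted_label = classify(centroids, 7)  # a category

# Hypothetical continuous data: quarter number against revenue.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_value = slope * 5 + intercept  # a point on a continuous scale
```

The classifier returns one of a fixed set of labels, while the regression returns a number that can fall anywhere along the scale.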

On the other hand, unsupervised learning provides unlabelled data to the system and lets it find structure in that data on its own. This approach is often used when you aren’t certain of the relationship between data points, or when you are looking for connections that aren’t immediately obvious. A common technique is clustering, which groups similar data points together so that relationships or patterns become visible. The resulting groups can then be applied to new sets of data to see if they follow the same pattern.
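As a rough illustration of clustering, the sketch below runs a very simple one-dimensional k-means in plain Python. The order values, the starting centroids and the function names `k_means_1d` and `assign` are all hypothetical, chosen only to show the idea of grouping unlabelled data and then placing new data into those groups.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (simple k-means)."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster; leave it alone if the cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def assign(value, centroids):
    """Return the index of the cluster whose centroid is nearest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


# Hypothetical unlabelled order values; no categories are supplied up front.
order_values = [10, 12, 11, 50, 52, 49]
centres = k_means_1d(order_values, centroids=[0, 60])

# New data can then be checked against the groups that were discovered.
cluster_for_new_order = assign(48, centres)
```

Nothing in the input says which orders are "small" or "large"; the two groups emerge from the data itself.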

Behind these groups, there are many specific algorithms that can be deployed. While there are plenty of tutorials on how to pick the right algorithm for a given task, the most important criterion for Machine Learning is that it should help the user make a more accurate decision.

What aims should Machine Learning projects have in place from the start? 

To deliver these more accurate decisions, it’s worth spending time on what results the project should deliver from the beginning. This may not be as easy as it sounds, as different individuals may have their own definitions of what results should be created.   

A good example is sales. For a CEO or business leader, analytics results could involve looking at the impact of macro-economic trends and getting predictions on performance over time. For a sales director, these results would be interesting but not useful. Instead, he or she would be looking for guidance on how to make the most of each potential interaction with a customer. 

Both roles within the business can be helped through greater automation, but the process to get to these results is very different. For the sales director, the improvement can be found by looking at previous sales data and what influenced other customers. 

The aim of a Machine Learning project should be to predict which next actions would best benefit the business. For the sales director, this could be a prediction on what piece of content should be offered to move the sales process along, or the best deal to put together to win the business. For the business leadership team, this area is a more complex one, where market trends and customer success have to be put into wider context.

The latter approach can be used to show how different teams and activities are providing results back to the business, as well as the percentage chance that any action will result in a benefit. Using this data, decisions on investment can be made that take the whole landscape into account. 

As part of this, teams have to think more carefully about the kinds of data that they capture over time and the metrics that they use. To continue our example, looking at sales volume data on its own can provide some insight, but it does not take wider market growth or external impacts into account. A small percentage increase in sales may be a huge win in a declining market, or it could be a sign that a team is underperforming compared to the wider market. Without more data to provide context, it can be difficult to determine which influencing factors are the most important.
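The arithmetic behind that point can be sketched in a few lines of Python. The figures and the function names `growth_rate` and `relative_performance` are hypothetical, used only to show how the same sales increase reads differently once market growth is included.

```python
def growth_rate(previous, current):
    """Fractional change from one period to the next."""
    return (current - previous) / previous


def relative_performance(sales_prev, sales_now, market_prev, market_now):
    """Sales growth minus market growth; positive means outperforming the market."""
    return growth_rate(sales_prev, sales_now) - growth_rate(market_prev, market_now)


# Hypothetical figures: the team grows sales by 2% in both scenarios.
in_declining_market = relative_performance(100, 102, 1000, 950)   # market shrinks 5%
in_booming_market = relative_performance(100, 102, 1000, 1100)    # market grows 10%
```

With these assumed numbers, the first result is positive (the team is ahead of the market) and the second is negative (the team is behind it), even though the sales figures are identical.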

Automating your data preparation for Machine Learning   

To deliver results for business users with Machine Learning, IT teams can’t be the central organisation carrying out the analytics and then providing results. This doesn’t make the most of automation, and it also puts a human layer between the business users and the data, which can lead to miscommunications or poor results. 

Instead, it’s important to look at what data is held centrally and what data sources might be brought in by individuals to complement this data. By combining central and “local” data, business users can get insights that are more valuable to them. However, relying on business users to prepare their data is asking a lot. 

By automating some of the processes of data preparation, and thus reducing the number of human steps involved, business users can more easily bring in their own spreadsheets or external data sources. What’s important is that business users should be able to see the relationships between data sets. For example, a business user might want to look at overall sales compared with local marketing analytics and external market data. By combining these data sets, the user can look for patterns in which customers were influenced by marketing campaigns and how this compared with other trends that were observed.
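As a minimal sketch of what combining central and local data can look like, the Python below joins a hypothetical central sales table with a hypothetical local campaign list on a shared customer key. The records, field names and the `left_join` helper are assumptions for illustration, not part of any particular analytics product.

```python
# Hypothetical central data held by the organisation.
central_sales = [
    {"customer": "A", "sales": 1200},
    {"customer": "B", "sales": 300},
    {"customer": "C", "sales": 800},
]

# Hypothetical "local" data brought in by a business user.
local_campaigns = [
    {"customer": "A", "campaign": "spring_email"},
    {"customer": "C", "campaign": "webinar"},
]


def left_join(left_rows, right_rows, key):
    """Attach matching right-hand fields to each left-hand row.

    Unmatched left rows keep only their own fields. If the right side has
    several rows for one key, the last one wins in this simple sketch.
    """
    lookup = {row[key]: row for row in right_rows}
    return [{**row, **lookup.get(row[key], {})} for row in left_rows]


combined = left_join(central_sales, local_campaigns, "customer")

# Compare sales for customers who were reached by a campaign with those who were not.
reached = [row["sales"] for row in combined if "campaign" in row]
not_reached = [row["sales"] for row in combined if "campaign" not in row]
```

A left join is used here so that every customer in the central data is retained, whether or not local marketing data exists for them.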

Automating the data preparation process can help business users see the value of Machine Learning and pattern recognition. By guiding the workflow for how data is prepared, and using this process to generate smarter recommendations on what to do with the data, it’s easier to cleanse, merge and refine data being imported. Making use of common user interface designs, such as drag-and-drop support, should remove the need for complicated scripting and IT involvement. 

As part of this process, Machine Learning can be used to create a semantic layer within the analytics results that can give users a business interpretation of the data using common rules and definitions, as well as demonstrate where results have been derived from. This enables everyone to see the history of work that took place to deliver results and helps grow confidence in the results that are being created. By creating this “data lineage” everyone can see both what the data means and where it comes from.   
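One way to picture data lineage is as a record of every step that was applied to reach a result. The sketch below is a deliberately small illustration in Python; the `TrackedValue` class, its `apply` method and the example figures are hypothetical and do not describe how any specific analytics platform implements lineage.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedValue:
    """A value that carries a description of every step used to derive it."""
    value: float
    lineage: list = field(default_factory=list)

    def apply(self, description, func):
        """Return a new TrackedValue with func applied and the step recorded."""
        return TrackedValue(func(self.value), self.lineage + [description])


# Hypothetical example: a revenue figure and one preparation step.
revenue = TrackedValue(1000.0, ["loaded from central sales table"])
net_revenue = revenue.apply("subtracted refunds of 50", lambda v: v - 50)

for step in net_revenue.lineage:
    print(step)
```

Because each step returns a new object instead of modifying the original, the earlier value and its history stay available for anyone who wants to trace a result back to its source.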

Machine Learning should provide business users and analytics teams with faster access to data and better results for decision making over time. However, it is not a magic wand that can deliver outcomes on its own. Instead, it’s important to look at Machine Learning as a way to automate how businesses connect their data, share the results with others, and ultimately collaborate around the results.   

Pedro Arellano, Vice President, Product Strategy, Birst

Pedro Arellano is vice president, product strategy at Birst, leading development around networked data and analytics. Prior to Birst, he led marketing at MicroStrategy and hosted the Stereo Gol radio show.