Big data is the business buzzword of our era, bandied around at conferences and in the press as a panacea. However, data on its own is not the answer. As an unprocessed asset, data is a cost centre, not a source of profit. The ROI lies in what you do with the data and how you leverage it to drive business decisions, and the answer to that lies in data science.
Long the preserve of academics and rocket scientists, data science is now front and centre of business strategy and is one of the fastest growing areas of technology. Using advanced statistical techniques to extract value from data can be transformational for businesses, boosting existing revenue streams, creating entirely new sources of revenue and identifying areas of inefficiency and waste.
But what has driven this change?
There are two key factors. Firstly, the digital transformation of businesses has created entirely new sources of data. In the physical world, companies didn’t have the end-to-end visibility that they now have, from production through to consumption. This has created petabytes of data resources that can now be interrogated. Secondly, the cost of working with this data has massively decreased.
Cloud-based storage and processing services such as Amazon Web Services mean that previously unimaginable quantities of computing power are available for a fraction of the historical cost. This shift has led to many seminal moments in the development of data-driven decision making, with DeepMind’s victory at the strategy game Go, IBM Watson’s success at Jeopardy and algorithmic hedge funds amongst those that have claimed the headlines recently.
However, to better understand how data science can help your business, it’s first necessary to understand what it is and how it’s being used right now.
What is data science?
The techniques of data science are, as you might imagine, complex and can often seem inaccessible to the layperson. However, there are five key tools that are regularly used by businesses to access the value hidden in their data assets.
Clustering: This is the science of grouping things together. Examples include segmenting customers into groups and profiles with distinct characteristics, spotting hot-zones for outbreaks of viruses or infections, or grouping similar types of hotels in an area. Clustering algorithms help impose order on datasets that may otherwise be hard to gain insight into, and let the data decide the groupings for itself.
Regression: This is the act of predicting a number based on input data. Examples include predicting how much money a person will want to spend on a website based on their previous purchases, age, gender and address, predicting future demand for products, or predicting how long a customer may stay before leaving.
Classification: One of the most commonly used categories of problems in data science, classification is deciding whether a set of data, A, belongs to a category, Y, or not. For example, typical questions might be ‘Will this person I am lending money to default on the loan?’, ‘Will this person entering the hospital be infected by a superbug whilst staying here?’ or ‘Is this transaction fraudulent or not?’.
Natural language processing: Natural language processing (NLP) uses software to extract meaning from text. Applications range from simple sentiment analysis (‘Is this tweet positive or negative?’) to AI assistants in the form of IBM’s Watson.
Optimisation: This is a big part of data science. Mathematical problems such as the ‘travelling salesman’, which asks for the most efficient route between a set of locations, have been around for a long time, but the era of cheap and vast processing power allows us to solve many of these problems faster and more accurately.
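To make the last of these tools concrete, here is a minimal Python sketch of the travelling salesman problem solved by brute force. The stop names and coordinates are invented for illustration, and trying every ordering is only practical for a handful of stops; real routing problems rely on heuristics or specialist solvers.

```python
from itertools import permutations
from math import dist

# Hypothetical stops with made-up (x, y) coordinates; not taken from any real dataset.
STOPS = {
    "depot": (0, 0),
    "a": (0, 3),
    "b": (4, 3),
    "c": (4, 0),
}


def route_length(route):
    """Length of a closed tour: visit the stops in order, then return to the start."""
    legs = zip(route, route[1:] + route[:1])
    return sum(dist(STOPS[p], STOPS[q]) for p, q in legs)


def shortest_route(start="depot"):
    """Try every ordering of the remaining stops and keep the shortest closed tour."""
    others = [s for s in STOPS if s != start]
    candidates = ((start,) + perm for perm in permutations(others))
    return min(candidates, key=route_length)


best = shortest_route()
print(best, route_length(best))
```

The number of orderings grows factorially with the number of stops, which is why cheap processing power and smarter algorithms matter so much for this class of problem.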
How do I use data science?
These techniques taken in isolation could be seen as just mathematical game playing. However, when applied to real world problems, they can help fundamentally change business decision making. There are innumerable examples of how data science has impacted businesses, but a few core use cases might help illustrate the potential.
Customer churn: Using data about a telecoms company’s customers, it can be relatively straightforward to estimate the likelihood that an individual user will churn. By mapping variables such as how many times a specific customer has called customer service, how much they spend on day, evening, night time and international calls, and whether or not they have a voicemail plan, you can identify the groups that are most likely to churn and take action to retain them.
Rental pricing: A clear example of the power of data science is demonstrated by a pricing optimisation algorithm for property rental. By mapping occupancy against pricing, area, property attributes and reviews, you can establish which properties are relatively underperforming against their peers.
This insight can then be used to identify those properties that are either over- or under-priced. From an analysis of openly available Airbnb data, I was able to identify an average £11.20 potential improvement in rental pricing on underperforming properties, an implied additional £130,000-£880,000 uplift in London alone.
Product recommendation: One substantial area of progress in data science in the last few years has been the ability to recommend products with better accuracy, thus boosting sales. The particular technique implemented here is collaborative filtering with implicit feedback, based on the idea that users who purchase or view similar items will have similar tastes. Using these approaches, I have been able to generate 88% accuracy in product recommendation for customers, leading to increased sales and customer loyalty.
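The idea behind collaborative filtering with implicit feedback can be sketched in a few lines of Python. This is a deliberately simple item-to-item version using cosine similarity over who viewed or bought what; it is not the model behind the figures quoted above, and the users, items and function names are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical implicit feedback: the set of items each user has viewed or bought.
INTERACTIONS = {
    "u1": {"kettle", "toaster", "mug"},
    "u2": {"kettle", "toaster"},
    "u3": {"kettle", "mug", "teapot"},
    "u4": {"lamp", "desk"},
}


def item_similarity(interactions):
    """Cosine similarity between items, based on the users who touched both."""
    users_by_item = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            users_by_item[item].add(user)

    sims = defaultdict(dict)
    for a, users_a in users_by_item.items():
        for b, users_b in users_by_item.items():
            if a == b:
                continue
            shared = len(users_a & users_b)
            if shared:
                sims[a][b] = shared / sqrt(len(users_a) * len(users_b))
    return sims


def recommend(user, interactions, top_n=2):
    """Rank items the user has not touched by summed similarity to items they have."""
    sims = item_similarity(interactions)
    seen = interactions[user]
    scores = defaultdict(float)
    for item in seen:
        for other, sim in sims[item].items():
            if other not in seen:
                scores[other] += sim
    # Highest score first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("u2", INTERACTIONS))
```

In this toy data, user u2 has only touched the kettle and toaster, so the sketch ranks the mug ahead of the teapot because more of the users who share u2's items also picked up a mug. Production systems typically use matrix factorisation over far larger and sparser data, but the underlying intuition is the same.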
What are the challenges?
This is merely a snapshot of a large and fast-growing area of computing, but it demonstrates the broad and significant benefits that the approach can bring to businesses. However, while the logic behind the science is clear, as a business approach it can require significant effort to implement.
It can’t just be the preserve of the IT department; it has to be a cross-functional effort combining both technology and business expertise. Implemented correctly, however, data science can revolutionise business decision making and materially improve an enterprise’s performance.
Henry Brown, data scientist at BJSS