The big data boom: What, why, and how?

Big data is a term ceaselessly bandied around the IT sector, but to what extent have you really got to grips with it? Deepening our collective understanding of the subject is Dr. Andrew Jennings, the chief analytics officer at FICO and head of FICO Labs, in our special Q&A.

1) What were the key milestones in the history of predictive analytics?

Many of the early milestones came from military applications in the 1930s and 1940s. For example, Alan Turing and I. J. Good did ground-breaking work on assigning weights of evidence to specific variables while they were breaking German codes in World War II. The 1950s and 1960s saw the development of methodologies for modelling, such as the work done by Bill Fair and Earl Isaac on credit scoring. In the late 1990s, the rise of Internet search and personalisation by eBay, Amazon and Google certainly set the stage for the rise of big data. You can see more milestones in FICO's recent analytics infographic.

2) What are some of the common uses of predictive analytics today?

Predictive analytics is widely used in the travel industry, both to set flight paths and ticket prices and to help consumers find the best prices. In the credit industry, it's central to both risk assessment and fraud detection. And, of course, marketers in many industries use it to identify the best offers for each individual.

3) Big data is undoubtedly a hot topic at the moment, but are there many companies already using big data insights in their day-to-day operations?

Yes, and some companies have their entire business model based on analysis of big data. One example would be Farecast, a company formed to help consumers determine when to buy airline tickets in order to get the best price.

4) How has the rise of big data affected the use of analytics?

More companies today realise that they're not going to be competitive if they can't put data to work. Previously, most analytics were what we would call business intelligence, focused on reporting. Today, companies understand that the level of personalisation required to compete with online giants such as Amazon is only possible if they understand their customers much better and act on that insight with more personalised service. This has driven a huge increase in demand for analytics: the analytics software industry grew from $11 billion (£7.2 billion) in 2000 to $35 billion (£23 billion) in 2012.

5) What kind of impact will text analytics have?

Text analytics and its counterpart, speech analytics, will have a massive impact. In order to build predictive analytics models, information has to be provided in a numerical form. Natural language processing enables text and speech to be converted into a digitised format that can be used in modelling. Since most human communication is language-based, we will have a much larger set of data to use in models, enabling us to really crack new problems. For instance, the terms people use when they do online searches are being analysed to identify the outbreak of a disease in a specific region.
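As a rough illustration of what converting text into a numerical form can look like, here is a minimal Python sketch of one simple, widely used approach, a bag-of-words count. It is not a description of any particular vendor's method; the function names and the sample search queries are hypothetical, chosen only for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Turn a list of strings into word-count vectors.

    Returns the sorted vocabulary and one count vector per document,
    where position i holds how often vocabulary[i] occurs in that document.
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[token] for token in vocabulary])
    return vocabulary, vectors


# Hypothetical search queries, used only to illustrate the idea.
queries = ["flu symptoms", "flu remedy", "cheap flights"]
vocabulary, vectors = bag_of_words(queries)

print(vocabulary)
for query, vector in zip(queries, vectors):
    print(query, vector)
```

Each query becomes a row of numbers that a model can work with. Real systems typically go further, for example by weighting rare terms more heavily or by handling misspellings and synonyms.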

6) What does putting analytics into a cloud computing infrastructure mean for the industry?

The cloud lowers the barrier to entry for analytics. More companies than ever before will be able to access analytics, without having to spend a great deal of money on software tools and hardware. First, businesses can "build in the cloud" using modelling tools. Second, businesses can access analytic services pre-developed for specific business problems, or tailor analytic services to their business rapidly. Third, an advanced cloud can put businesses in touch with a community of analytic experts. Fourth, some clouds create an "analytics marketplace" — an app exchange or app store for analytics developed by third parties.

7) On the one hand big data is believed to be a solution to many pressing economic and societal challenges. On the other, privacy advocates argue that once data's been collected, we have no control over who uses it or how it is used. How can organisations overcome this negative perception and is there such a thing as a big data code of ethics?

There is no big data code of ethics, but there certainly are controls over who can access what data, in the form of privacy regulations at both the national level and the industry level. An individual's financial data, such as the data reported to a credit bureau, is subject to very strict regulations, for example. The challenge is that new sources of data are coming online quickly, and in some cases there may be a lag between when data becomes available and when the regulations are put into place. It's important for every business using data to follow the spirit of privacy regulations, and to consider whether their use will compromise individual privacy.

8) Are there enough experts in analytics to meet the explosion in business demand?

No, and this is a problem. Between 2011 and 2012, job posts for "data scientists" jumped by 15,000%. There is a talent gap worldwide, and people who have been trained in analytics, statistics and operations research are in demand.

Unfortunately, the global demand means we are seeing a number of people declaring themselves analytic experts who are not as well-trained as the professionals doing the job already. However, most analytics companies and most analytics teams within businesses are led by analysts who can tell whether a job applicant has the necessary skills. The ideal analyst has the maths skills, the mindset of a problem solver, and good communication skills. There are certainly some strong universities across Asia that are world-renowned for their analytics programmes and graduates, including Renmin University, the University of International Business and Economics, the Indian Statistical Institute and the Indian Institute of Technology. The Harvard Business Review has called data scientist the "sexiest job of the 21st century," so this is an excellent time to be an analyst!

Dr. Andrew Jennings is chief analytics officer at FICO and head of FICO Labs. He blogs on the FICO Banking Analytics Blog and the FICO Labs Blog.