Talking with data: Deriving value from data through Natural Language Processing

Becoming data-driven is the new mantra in business. The consensus is that the most value can be achieved by getting the data into the hands of the decision makers at the top. This is, however, only true if these data consumers know how to handle data. Getting managers to think like data scientists is one way to approach this challenge; another is to make data more approachable and more human. As such, it is no surprise that Natural Language Processing is the talk of the data-driven town.

Emergence of a new type of language

When considering the next generation of digital user interaction, Natural Language Processing (NLP) might not be the first technology that comes to mind, as it is by no means a new concept. NLP, as referenced on Wikipedia, is a subfield of computer science, information engineering and artificial intelligence concerned with the interactions between computers and human (natural) languages: in particular, how to program computers to process and analyse large amounts of natural language data.

Developments in the related disciplines of ML and AI are propelling the use of NLP forward. Industry leaders like Gartner have identified conversational analytics as an emerging paradigm. This shift enables business professionals to explore their data, generate queries, and receive and act on insights in natural language, whether by voice or text, through mobile devices or personal assistants.

Becoming fluent in NLP

When facing strategic obstacles that can hinder innovation and muddy decision-making, such as organisational silos, Deloitte found that organisations whose leaders embody the characteristics of the Industry 4.0 persona of the “Data-Driven Decisive” overcome these roadblocks through a methodical, data-driven approach and are often bolder in their decisions.

In order to effectively apply data throughout an organisation, companies need to provide employees with a base-level understanding of the importance and role of data within their business. And, looking at the overwhelming demand to be met here, the challenge needs to be approached from both ends through education and tools. 

Teaching employees how they can use data and which questions to ask will go some distance towards establishing a group of data-capable individuals within the workforce. Giving them effective media through which to consume data exponentially increases the number of people who can ‘read’ data (manipulate, analyse and visualise it) well enough to make decisions based on it.

The aim is not to convert everyone into a data scientist. Data specialists will still be needed to do more forward-looking number crunching, and both groups might yield different solutions. Natural Language Processing, as used in Tableau’s Ask Data solution, mainly aims to lower the bar for all the non-data experts to use data to improve the results of their day-to-day jobs.

Where NLP does the heavy lifting

The strength of NLP lies in its ability to sift through hundreds or thousands of data tables in milliseconds, thereby enabling the technology to find potential hits that correspond with human expressions. As such, NLP allows employees to slice and dice data without being confronted with the overwhelming amount of options that exist to filter data. 
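This matching step can be sketched in a few lines. The example below is purely illustrative and not Tableau's implementation: it uses simple fuzzy string matching to find fields across a hypothetical catalogue of tables whose names resemble a user's words.

```python
# Illustrative sketch (not Tableau's actual method): matching a user's
# search term against field names across many tables.
from difflib import get_close_matches

# Hypothetical catalogue of tables and their columns.
catalogue = {
    "sales": ["order_date", "gas_price", "region", "revenue"],
    "hr": ["employee_id", "salary", "department"],
}

def find_fields(term: str, catalogue: dict) -> list:
    """Return (table, column) pairs whose names resemble the search term."""
    hits = []
    for table, columns in catalogue.items():
        # cutoff=0.6 keeps only reasonably close matches
        for match in get_close_matches(term, columns, cutoff=0.6):
            hits.append((table, match))
    return hits

print(find_fields("regions", catalogue))  # → [('sales', 'region')]
```

Even this toy version shows the idea: the user types an approximate word, and the system surfaces the candidate fields rather than forcing the user to browse every table.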

New AI techniques enable NLP to do a lot of the heavy lifting. If you ask an NLP-driven analytics system for the “average price of gas by region”, the system not only searches for gas prices, but also ‘knows’ to aggregate by region and show the average.
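The ‘knowing’ step described above amounts to parsing the phrase into an aggregation and a grouping field. A toy sketch of that idea, using simple keyword rules rather than any real system's pipeline, might look like this:

```python
# Hypothetical sketch: mapping "average price of gas by region" to an
# aggregation function and a group-by field with simple keyword rules.
AGG_WORDS = {"average": "avg", "total": "sum", "count": "count"}

def parse_query(text: str) -> dict:
    tokens = text.lower().split()
    # First aggregation keyword found wins; None if there is none.
    agg = next((AGG_WORDS[t] for t in tokens if t in AGG_WORDS), None)
    # The word after "by" is treated as the grouping field.
    group_by = tokens[tokens.index("by") + 1] if "by" in tokens else None
    return {"aggregation": agg, "group_by": group_by}

print(parse_query("average price of gas by region"))
# → {'aggregation': 'avg', 'group_by': 'region'}
```

Real systems use far richer grammars and statistical models, but the output is the same in spirit: a structured query derived from free-form language.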

Sentiment analysis and intent inference have also come on in leaps and bounds. The mature field of computational linguistics is accompanied by subfields that are still in their infancy, such as conversational analytics. In simple terms, the computer understands not only what we say, but is also getting smarter at comprehending what we might mean by using certain language.

Deciphering ambiguity

Inference remains an area where things can get a bit complicated. NLP is good at interpreting language, and at spotting ambiguity when there isn’t enough clarity in the underlying data sets.

While the business user enters a search term in Ask Data and sees the answer being presented in the most insightful way, by pulling out the right elements from the right tables and variables, the actual SQL query under the hood is hidden from the user’s view. 
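As a purely hypothetical illustration of what sits under the hood, the earlier question about the “average price of gas by region” might translate into a generated SQL query like the one below; the table and column names here are invented for this sketch.

```python
# Hypothetical illustration: the SQL an NLP layer might generate behind
# the scenes for "average price of gas by region". The schema is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fuel_sales (region TEXT, gas_price REAL)")
conn.executemany(
    "INSERT INTO fuel_sales VALUES (?, ?)",
    [("North", 1.50), ("North", 2.50), ("South", 1.25)],
)

# The user never sees this query; they only see the resulting answer.
generated_sql = """
    SELECT region, AVG(gas_price) AS avg_gas_price
    FROM fuel_sales
    GROUP BY region
    ORDER BY region
"""
print(conn.execute(generated_sql).fetchall())
# → [('North', 2.0), ('South', 1.25)]
```

The point of tools like Ask Data is precisely that the business user gets the aggregated answer without ever writing, or even seeing, this query.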

NLP is good at leaving no stone unturned when it comes to solving problems. But it’s not the best interface when the user doesn’t know enough about what they’re looking for, can’t articulate a question, and would prefer to choose from a list of options. For example, a user might not know the name of a particular product, but if they click to view a menu with a list of products to filter through, they’ll be able to make an easier choice. This is where mixed-modality systems shine.

NLP is still not the most effective at resolving a query when there’s lots of room for interpretation, especially when it hasn’t seen the specific query before. For example, if a colleague were to ask to ‘email Frank,’ then we as humans tend to know to look for the Franks we know professionally, not the Franks in our family or circle of friends. As humans, we have the advantage of tapping our memory to inform the context of a request based on who is making the request. NLP still has some catching up to do in this department. 

Enabling a culture of data

For companies looking to start talking with their data, the most important first step is to enable a culture of data. It is also important to pay attention to the needs and wants of the people that are required to handle data. 

As with a lot of other implementations, starting with a small team and then expanding tends to be a successful approach. By equipping your team with the tools needed to explore data and ask questions, the team will get exposed to the new ways data can be accessed. It’s also vital to make them aware of the growing global community of data explorers, which functions as a sharing economy of tips and tricks.

Lastly, as functionality is still very much in the developmental phase, providing insight to vendors to inform product updates and new capabilities is invaluable. Endless chatter will get you nowhere. Meaningful conversations (with data) are the ones that count. 

Ryan Atallah is staff software engineer, Tableau

Tableau is using AI and ML technologies to bring data analytics to the masses and address the data skills gap at every level, with the company’s announcement this week of Ask Data – a new and intuitive way to engage with and analyse data using natural language processing.

Ryan Atallah is the former CTO and co-founder of ClearGraph, and prior to this was a Computer Science Section Leader at Stanford University.