Back in the days of Eliza, Alice and Jabberwacky, early chatbots developed between the 1960s and the 1990s, capability was still rudimentary. When confronted with the complexities of human communication, they were very easily confused. Ultimately, they were flowcharts, and their responses resulted from relatively rigid if/then scripts: if asked “What is your name?”, then answer “Alice”.
Fast forward thirty years and research in AI has given rise to various conversational interfaces built on machine learning and natural language processing. As a result, the last few years have seen rapid growth in models that detect patterns in human language and determine intent, especially when what is said doesn’t quite match what is meant. This is more critical now than ever, because increasing numbers of people require online assistance due to the shift in behavior resulting from the global pandemic.
Bots need to change how they communicate. One way to do this is to use a neural net structure that learns word associations from large bodies of text, so that sentiment analysis and named entity recognition can be derived from word similarity. This is achieved by exploiting a linguistic concept called semantic proximity: simply put, the idea that words with similar meanings tend to occur in similar contexts more often than dissimilar words do.
Each distinct word is assigned a list of numbers called a vector. The model defines its dictionary as a vector space of these words in 300 dimensions, where the similarity in direction of two vectors indicates the similarity in meaning of the two words. Then, by looping through the sample text it is trying to read, it fits a model based on a pre-defined number of neighboring words either side of each word. A neural net does this fitting, and the weights from its first layer are saved as the word vectors. Words do not need to be next to each other to be detected as similar after enough training: if they are generally surrounded by similar words, it can be assumed linguistically that they have similar meaning.
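To make “similarity in direction” concrete, here is a minimal sketch of cosine similarity, a common way to compare the direction of two word vectors. The three-dimensional vectors below are invented for illustration (a trained model would supply its own, in far more dimensions), and the function name is an assumption, not part of any particular library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, made up for this example only.
vectors = {
    "doctor": [0.9, 0.8, 0.1],
    "nurse": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.0, 0.9],
}

similar = cosine_similarity(vectors["doctor"], vectors["nurse"])
dissimilar = cosine_similarity(vectors["doctor"], vectors["banana"])
print(similar > dissimilar)
```

With these toy values, “doctor” points in nearly the same direction as “nurse” and in a very different direction from “banana”, which is the property the vector space is meant to capture.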
Dual architecture method
The model creates links between the surrounding and target words in a body of text using the skip-gram and CBOW (continuous bag of words) models. Both train neural nets to distil semantic information about the language in a text from each word’s relationship to the words around it. CBOW does this by iteratively trying to predict a target word from the words around it; skip-gram does it by trying to predict the surrounding words from a target word. This semantic information is then stored in the first weight layer of the neural nets used in CBOW and skip-gram, called the embedding layer. Multiplying this layer by a pre-computed representation of a word (typically a one-hot encoding) extracts the word vector for any input word in the training text. This creates the vectors to be mapped into the space described above.
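The difference between the two methods is easiest to see in the training examples each one builds from the same text. The sketch below is illustrative only: the function names and the window size are assumptions, and the neural net training itself is omitted. The last function shows why multiplying a one-hot vector by the embedding layer amounts to reading out one row of it.

```python
def neighbours(tokens, i, window):
    """Words within `window` positions either side of position i."""
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]

def cbow_examples(tokens, window=2):
    """CBOW: predict the target word from its surrounding words."""
    return [(neighbours(tokens, i, window), target)
            for i, target in enumerate(tokens)]

def skipgram_examples(tokens, window=2):
    """Skip-gram: predict each surrounding word from the target word."""
    return [(target, word)
            for i, target in enumerate(tokens)
            for word in neighbours(tokens, i, window)]

def embedding_lookup(one_hot, embedding_matrix):
    """Multiply a one-hot row vector by a (vocabulary x dimensions) matrix."""
    dims = len(embedding_matrix[0])
    return [sum(one_hot[r] * embedding_matrix[r][c] for r in range(len(one_hot)))
            for c in range(dims)]

tokens = ["the", "cat", "sat"]
print(cbow_examples(tokens, window=1))
print(skipgram_examples(tokens, window=1))

# Hypothetical 3-word vocabulary with 2-dimensional embeddings.
embedding_matrix = [[1, 2], [3, 4], [5, 6]]
print(embedding_lookup([0, 1, 0], embedding_matrix))  # row for the second word
```

CBOW produces one example per word (many inputs, one output), while skip-gram produces one example per word pair, which is one reason it gives rarer words more training signal.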
A dual architecture method is very effective. CBOW trains faster and gives better representations for the more common words in a text body, while skip-gram works well with smaller datasets and gives strong representations for the rarer words in a text body.
The model can also map the word vectors into as many dimensions as a comparison requires. To simplify things mathematically, principal component analysis maps the vector space onto a set of axes of the coder’s choice, where each axis is picked to represent the most useful data (the directions along which the data varies most). These “principal” components are kept and the other dimensions of the vector are ignored for clarity. This allows for interesting semantic mapping of a dataset. It can provide a nuanced description of proximity, and by extension similarity of vocabulary, that more rudimentary methods of NLP such as entity/intent-based recognition may miss.
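As a rough illustration of the idea, the sketch below finds only the first principal component of a handful of vectors, using power iteration on the covariance matrix, and projects the vectors onto it. This is a simplified stand-in chosen to keep the example dependency-free; in practice a library routine would normally be used and two or three components kept for plotting. The data and function names are invented for the example.

```python
import math

def center(rows):
    """Subtract the per-dimension mean so the data is centered on the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Power iteration: converges towards the direction of largest variance.
    Assumes the covariance matrix is non-zero and the starting vector is not
    orthogonal to that direction."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, component):
    """Coordinate of each centered row along the chosen component."""
    return [sum(x * c for x, c in zip(row, component)) for row in rows]

# Hypothetical 3-dimensional "word vectors"; the third dimension never varies.
data = [[0.0, 0.0, 5.0], [1.0, 1.0, 5.0], [2.0, 2.0, 5.0], [3.0, 3.0, 5.0]]
centered = center(data)
component = top_component(covariance(centered))
print(project(centered, component))
```

Here all of the variation lies along the diagonal of the first two dimensions, so a single axis is enough to preserve the ordering and spacing of the points, and the constant third dimension is dropped.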
A weighted value of proximity, derived either through direct comparison or through weight mapping, can therefore show the semantic similarity between language choices in a body of text. This allows for a more nuanced form of language processing. Take, for example, the question:
“Who is the Michael Jordan of golf?”
Given this input, an entity/intent method of NLP would probably stop at detecting the entity of a basketball player and the subject of golf before coming undone. With the machine learning based model of NLP, however, the vector mapping could be used to calculate the Euclidean distance between the word vector ‘Michael Jordan’ and its highest entity pair (basketball), and then to iteratively find a node at a similarly minimized distance from golf, say, ‘Tiger Woods’.
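One common way to express this kind of lookup is as a vector offset: take the vector for ‘Michael Jordan’, subtract ‘basketball’, add ‘golf’, and return the nearest remaining word by Euclidean distance. The sketch below uses that formulation as an assumption; it is not necessarily the exact procedure described above, and the three-dimensional vectors are invented so that the arithmetic is easy to follow.

```python
import math

def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def analogy(vectors, a, a_domain, b_domain):
    """Find x such that x is to b_domain what a is to a_domain."""
    query = [x - y + z for x, y, z in
             zip(vectors[a], vectors[a_domain], vectors[b_domain])]
    # Exclude the words used to build the query itself.
    candidates = (w for w in vectors if w not in (a, a_domain, b_domain))
    return min(candidates, key=lambda w: euclidean(vectors[w], query))

# Hypothetical vectors: [basketball-ness, golf-ness, star-player-ness].
vectors = {
    "basketball": [1.0, 0.0, 0.0],
    "golf": [0.0, 1.0, 0.0],
    "Michael Jordan": [1.0, 0.0, 1.0],
    "Larry Bird": [1.0, 0.0, 0.8],
    "Tiger Woods": [0.0, 1.0, 1.0],
    "caddie": [0.0, 1.0, 0.2],
}

print(analogy(vectors, "Michael Jordan", "basketball", "golf"))  # Tiger Woods
```

With real embeddings the match would rarely be exact, so the nearest candidate is an estimate and depends heavily on the training text.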
The above example shows a new layer of complexity that can be added to questioning of chatbot engines.
For a more enterprise-oriented example, we’ll use a question relevant to a medical aid company.
The question could be posed to the integrated messaging bot via WhatsApp or text:
“Find me the best medical center near me for prescriptions”
The word ‘best’ would normally be problematic under less rigorous models. With pertinent training, however, the vector mapping for ‘best’ in a medical context could return associated information such as shortest waiting time, lowest logged medication error rate and lowest price. This can then be fed through the NLP engine to return the ‘best’ centers based on each of those parameters.
Hopefully the difference between the two processing approaches speaks for itself. In addition, this machine learning model of NLP could also increase the accuracy of intents and entities for simpler questions. For example, where a specific term, such as medical language or a niche term, cannot be resolved by the architecture into an intent or entity, its word vector could be used to find the closest synonym among the keywords the engine accepts, decreasing the likelihood that a bot fails to understand a given question.
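A minimal sketch of that fallback might look like the following. The function name, the similarity threshold and the toy two-dimensional vectors are all assumptions made for illustration; a real engine would take its vectors from a trained model and tune the threshold against its own data.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_keyword(term, vectors, accepted_keywords, threshold=0.8):
    """Return the accepted keyword closest to `term`, or None if the term is
    unknown or nothing is similar enough."""
    if term not in vectors:
        return None
    best, best_score = None, threshold
    for keyword in accepted_keywords:
        score = cosine_similarity(vectors[term], vectors[keyword])
        if score >= best_score:
            best, best_score = keyword, score
    return best

# Hypothetical vectors: [medication-ness, scheduling-ness].
vectors = {
    "analgesic": [0.9, 0.1],
    "painkiller": [0.85, 0.15],
    "appointment": [0.1, 0.9],
}
accepted = ["painkiller", "appointment"]

print(nearest_keyword("analgesic", vectors, accepted))  # painkiller
```

Returning None below the threshold matters: mapping an unfamiliar term to a poor match could be worse than asking the user to rephrase.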
This ability to enhance the complexity of chatbots and contextualize them has a number of implications. It means that a chatbot can do more than just hold conversations with customers. This form of contextual AI upskills chatbots to include multiple functions from customer service through to mental health and wellness monitoring. It is becoming increasingly clear that platforms are becoming so smart that they can be used to automate processes that you didn’t even know could be automated and ones that you don’t even know you need yet!
Charlie Masters, head of research, Vroomf