The evolution of voice recognition technology

(Image credit: Shutterstock/polkadot_photo)

It may seem that the device we all rely upon so extensively, the smartphone, has been with us forever and we can’t imagine a world without one. The truth, however, is that mass acceptance of smartphones only occurred with the release of the first iPhone in 2007 and accelerated rapidly until Android became market-share leader in 2012. These were followed by the advent of the tablet and more recently, the huge growth in smart assistants such as Google Home and Amazon Echo.

With the smartphone came a completely new way to do business and connect with customers. The birth of the App has changed irreversibly the way consumers shop, transact, research and communicate. The web, a huge advancement in electronic services and commerce itself, is no longer the preferred medium for many people. Smart assistants are the new frontier as businesses realise they offer a new medium of reaching and transacting with their owners; they are more than electronic DJs or light switches.

What is clear is that with the introduction of these devices, the smartphone especially, businesses have had to adapt their strategies and adopt the technology or lose market-share. Every major bank, for instance, must have a smartphone app. In fact, most new challenger banks and many remote healthcare providers, amongst other industries, are based solely on smartphone apps.

In parallel with the emergence of these new channels, there have been major advances in Artificial Intelligence (AI) and in particular AI chatbot technology. These chatbots are designed to conduct a lifelike conversation with a customer and provide many benefits, not least cost reduction.

Chatbots can be integrated into any customer-facing channel, and Gartner predicts that chatbots will be involved in 85 per cent of all customer service interactions by 2020. Having worked on projects involving this technology, such as IBM’s Watson, and seen the possibilities AI chatbot technology brings, I do not find this prediction surprising.

The combination of the new channels, devices and the rise of AI chatbot technology, along with existing speech recognition technology, has seen voice emerge as the new User Interface (UI) with machines.

The growth of IVR

The keypad is fast being replaced by voice commands in many apps, and Interactive Voice Response (IVR) systems have been using voice as the sole or dominant UI for years. Smart assistants such as Alexa or Siri operate solely on voice commands.

Voice, therefore, is already changing the way businesses operate from a customer interaction perspective, insofar as understanding what the customer is saying or requesting. This is fine if requesting a music playlist or making a general enquiry. However, voice as a UI is not and must not be restricted in the scope of potential use-cases. Banks are already deploying “skills” for smart assistants and chatbots can do so much more than just answer queries; they should be empowered to execute transactions, including ones that carry financial or informational risk. They cannot effectively provide customer fulfilment, which is their primary objective, if they cannot authenticate the users (i.e. recognise users’ identity with certainty).

This is where voice recognition extends the reach of speech recognition; not simply understanding what is being requested, but who is making the request. As the use-cases increase so too does the need for strong authentication. It is vital, however, that the authentication or identification of the user is as frictionless and as natural as possible. The way to achieve this is to make the authentication part of the actual UI, i.e. a speech driven command requires a simultaneous speech driven authentication.

Take, for instance, the example of a banking skill deployed onto a home smart assistant. Without the ability to identify who is speaking, the bank would be restricted in deploying real functionality within their skill. Speaking a PIN or password that can be overheard would hardly be considered a secure solution. However, if the bank could identify the family member (or friend, or intruder) as being or not being the account holder, such functionality could be deployed. This would increase the bank’s customer reach and create a new (secure, efficient) channel for transacting. Considering that Gartner, in 2017, predicted that by 2020, 75 per cent of US households would have a smart speaker, the benefits of enabling this channel are obvious.
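The idea of tying authentication to the spoken command itself can be sketched in a few lines of Python. This is a minimal illustration only, not any vendor's implementation: the intent names, the score thresholds and the `authorise` function are all hypothetical, and the speaker score is assumed to come from a separate voice biometric engine that compares the utterance with the account holder's enrolled voiceprint.

```python
from dataclasses import dataclass

# Hypothetical risk tiers: the minimum speaker-match score each intent requires.
# Real thresholds would be set by the service provider's own risk policy.
REQUIRED_SCORE = {
    "play_music": 0.0,       # no identity needed
    "check_balance": 0.80,   # informational risk
    "transfer_funds": 0.95,  # financial risk
}


@dataclass
class Utterance:
    intent: str           # what was requested (output of speech recognition)
    speaker_score: float  # who is speaking: assumed 0..1 match against the enrolled voice


def authorise(utterance: Utterance) -> bool:
    """Allow the request only if the same utterance also proves the speaker's identity."""
    threshold = REQUIRED_SCORE.get(utterance.intent)
    if threshold is None:
        return False  # unknown intents are refused by default
    return utterance.speaker_score >= threshold


if __name__ == "__main__":
    # A family member asks for music: allowed without authentication.
    print(authorise(Utterance("play_music", 0.10)))
    # The same voice asks to move money: refused.
    print(authorise(Utterance("transfer_funds", 0.10)))
    # The enrolled account holder asks to move money: allowed.
    print(authorise(Utterance("transfer_funds", 0.99)))
```

Because the command and the voice sample are the same utterance, the user takes no extra step; the riskier the request, the stronger the voice match it demands.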

Similarly, a speechbot integrated within an app could not be authorised to provide sensitive information, such as financial or healthcare data, without knowing whether the requester is authorised to access it. Analysing the voice of the requester, whilst the request is being made, is the simplest and most secure way of achieving this. For digital healthcare companies and digital banks, this is their raison d’être.

Gartner, in a 2017 Hype Cycle analysis of the Identity and Access Management (IAM) sector, placed biometric authentication at the far end of the “Slope of Enlightenment”, meaning biometrics were already in the mainstream of authentication methods. The “Plateau of Productivity”, which Gartner uses to describe fully mainstream and expected/prerequisite technology, was forecast two years further on, i.e. 2019 or today. At around the same time as the Gartner report, Visa® commissioned a survey into the use of biometrics as a form of payment authentication, which provides an intriguing insight into the rapidly growing acceptance and understanding of biometrics by (US) consumers.

Perhaps the most pertinent observation from the survey results, for service providers at least, was that 50 per cent of respondents said that they would consider leaving their card network, bank or mobile network operator if biometric authentication were not offered. Strong and user-friendly (biometric) authentication is therefore a very real selling point and marketing tool, as well as an effective means of combating fraud.

Voice recognition, whether used for authentication or identification, will become increasingly crucial not only in the expansion of use-cases for speech-driven command applications, but also in providing more self-service and self-certification opportunities for customers and consumers. As reflected in the Visa® survey, it will be as effective a tool for customer retention and acquisition as it will be for cost reduction and fraud prevention.

Pat Carroll, Founder & Chairman, ValidSoft

Pat is the founder, Executive Chairman and CEO of ValidSoft. He has over 25 years of experience in Information Technology and Financial Markets, where he has been an industry thought leader, and dominant expert in security, strong authentication and voice biometrics.