
Beyond smart speakers: What can voice tech really do?

(Image credit: Flickr / Best AI Assistant)

It comes as no surprise that smart speakers were one of 2017’s hottest gifts. According to a recent report, more than 7 per cent of Americans got a smart speaker over the holidays, with 4 per cent of Americans getting their first one. That brings the total to around 39 million people in the US who own one of these devices, and 38 per cent of owners plan to buy additional smart speakers to control smart home devices. As a result, the Alexa app for Amazon Echo smart speakers topped the App Store free charts, hitting #1 in a few regions, with Google Home also making the top ten. According to Pew Research, nearly half of US adults (46 per cent) say they now use digital voice assistants to interact with smartphones and other devices.

So why are smart speakers and voice-activated tech gaining such popularity? By providing an appealing, personal and natural means of interaction, voice has been opening the door for innovation since the telephone. Using only your voice, you can now seamlessly play music, turn on your lights, order a pizza and get breaking news. That may explain why more than 30 per cent of smart speaker owners already say their speaker is replacing time spent with the TV.

This is a massive and growing market, and Google Home and Amazon Echo are the undeniable leaders. Though they stole the show at CES 2018, Google and Amazon aren't alone in voice assistant news. Samsung has revealed it's adding its voice assistant, Bixby, to its smart TVs and Family Hub fridges this year. Facebook is also working on a 15-inch smart screen similar to Amazon’s Echo Show.

Beyond the home

Smart speakers and digital voice assistants have now become the centre of many companies' ecosystems, uniting connected environments at every touch point (smart home, connected cars, smart offices, smart cities, etc.). Sixty-four per cent of smart speaker owners are interested in having smart speaker technology in their car. Consumers and businesses want smart everything, and today's voice recognition systems can be set up to operate multiple applications. Beyond simply transcribing or taking dictation, time-saving voice recognition systems can even learn and create documents and emails using specialised and complex terminology.

Voice activation combined with chatbot technology is driving the new customer care revolution, generating customer experiences that are easy and efficient and that help build trust. Analysts at Gartner estimate that by the year 2018, “30 per cent of our interactions with technology will be through ‘conversations’ with smart machines”. As conversational interfaces become smarter, they will become more valuable in enhancing service and helping companies develop deeper relationships with their customers.

Amazon has revealed plans for Alexa for Business, enabling companies to set up voice practices for video conferencing, create private voice skills, implement Alexa commands for employees and connect to enterprise services such as Microsoft Exchange, Salesforce, and more. Google is also poised to update its Assistant for the business environment.

More voice applications

Advances in interactive conversational artificial intelligence are seen by many as essential to the future of many applications – from marketing and security to healthcare and social services. The New York Times even named ‘conversational computing’ as one of the five technologies set to ‘rock the world’. Some additional use cases for voice tech include:

  • Sonic branding: Beyond logos, colours and taglines – brands will be creating a sonic identity with easily identifiable ‘voices’ that represent their company and products. Brands want to leverage the same personalised experiences that have made voice assistants so popular to create emotional connections with consumers.
  • Hearables: Because of advances in voice technology, hearables may be the next big thing in wearables. Healthcare providers are using a hearable by Fujitsu as a hands-free, real-time translator for non-Japanese patients.
  • Voice biometrics: Voice authentication analyses a person’s voice for hundreds of unique characteristics, then matches them to a voiceprint file. This type of voice technology can overcome traditional limitations in security systems and can also be used to identify suspected and wanted criminals.
  • Assistive technology: Used primarily by people who are deaf, hearing impaired, or who have speech and/or language disabilities, Augmentative and Alternative Communication (AAC) technology is an emerging field. Augmentative communication can be accomplished through devices such as computers or hand-held devices and is designed to encourage communication and social interaction to help people with communication disorders to express themselves.

Intelligent machines?

As machines grow ever more intelligent, they're emerging not just as powerful tools, but as close companions. But how far are we from machines that can converse intelligently? In short – still very far. Cognitive services have made great advancements, but appropriate conversation between humans and machines requires something more than speech recognition, speech synthesis, and syntactic and semantic analysis. Understanding language is difficult for computers and AI systems because words often have meanings based on context, which requires common sense and real-world knowledge. AI developers will need to find ways to emulate qualities like emotional and social intelligence. To get there, we would need to create algorithms that enable computers to learn from complete sensory experiences that are much like our own. At this point, no one has identified how to give machines those human skills – or if it is even possible.

Although the current systems don’t yet understand the meaning of language, by leveraging deep learning and AI, voice technology is improving quickly, and text-to-speech systems are becoming less robotic and more human. 

Voice tech will be essential for many future innovations

Many experts agree that the future belongs to brands that are ready to ‘assist’. Everything is set to be controlled with voice, and audio will become the new ‘touch’. Because voice is a natural and logical way for us to interact, we see it being integrated into every aspect of personal life and business. We may be far from talking machines, but the "skills" are maturing. A voice ecosystem is already forming, and monetisation will soon follow. Amazon has introduced Alexa purchases and subscriptions and has started rewarding top-performing Alexa skill developers with direct payments. Google Assistant is sure to follow.

As consumer and business expectations continue to rise, the most accommodating brands will thrive. We advise media and entertainment companies to start incorporating this emerging technology – particularly in voice searches and via intelligent assistants. It's already becoming clear that the software, not the hardware, will largely determine success in the voice-activated technology market. 

Sergey Bludov, Senior Vice President, Media and Entertainment, DataArt

Sergey Bludov is SVP, Media and Entertainment at DataArt, the global technology consultancy. Bludov’s practice designs, develops and supports unique software solutions for clients in the entertainment, music, publishing, advertising and sports industries.