New voice-enabled devices have proven to be a breakout hit with consumers in 2016. For businesses embracing innovation, this opportunity represents a land grab as large and important as when Apple first opened the iOS App Store in 2008. Google now processes roughly 40,000 search queries every second. This translates to roughly 3.5 billion searches per day and more than 1.2 trillion searches per year worldwide. ComScore predicts that by 2020, 50 percent of all searches will be voice searches.
The recent success of products such as Apple Siri, Google Assistant, Amazon Echo, and Microsoft Cortana heralds a future in which a multitude of voice and chat assistants will be on hand to help with many daily tasks.
Until now, companies looking to build advanced voice or chat assistants have had few options. Open source or academic AI toolkits (such as TensorFlow, CNTK, SyntaxNet, CoreNLP, or NLTK) provide advanced algorithms but little data suitable for production-quality applications. Cloud-based developer tools (such as Facebook's wit.ai, Google's api.ai, Microsoft's LUIS, or Samsung's Viv) provide only limited support for building AI models for custom knowledge domains. As a result, consumers have been underwhelmed by the first generation of voice and chat assistants.
Deep-Domain Conversational AI (for example, MindMeld) promises to power a new generation of AI assistants that can streamline common daily tasks such as placing a take-out order at a local restaurant, booking a flight or hotel reservation, creating a service appointment at an auto repair shop or doctor's office, or finding retail store and product information. The AI enables large-scale language-understanding and question-answering capabilities on apps and devices for any custom content domain. It lets companies use their own proprietary data collections to create cutting-edge voice experiences that surpass the performance of general-purpose, consumer voice products like Siri, Cortana or Amazon Echo.
"Over the next five years, Conversational AI will transform how consumers interact with products and services provided to them across industries," said Dave Schubmehl, Research Director at IDC. "The companies that can build the best conversational interfaces will leapfrog ahead of the competition by delivering superior customer service with unprecedented operational efficiency."
Just in the past few months, companies like Google, Facebook, Apple, Amazon and Microsoft have launched new developer APIs for their most popular messaging applications, virtual assistant platforms and voice-enabled devices. As a result, companies that succeed in building useful voice and chat assistants can now reach billions of new users.
Many reviewers are noticing that Google Home is “smarter” than Echo and can understand more complex questions. Unlike the Echo, Google Home can understand context for follow-up questions; it will answer correctly when a user follows “what’s the weather” with “what about tomorrow,” for example. But Amazon Echo leads in range of function; its historically open relationship with developers has allowed it to amass over 3,000 skills. Google Home, on the other hand, opened its SDK only this month, and its ability to integrate with IoT devices and apps is still limited.
Google and Amazon’s investment in voice-enabled technology is backed by strong sales figures across the board. Amazon doesn’t release its sales numbers, but a November 2016 estimate by Consumer Intelligence Research Partners reports that the company has sold more than 5.1 million Echos in the United States alone since the device’s debut. That’s comparable to the early iPhone sales figures, raising the question of whether voice interfaces will grow to the same level of popularity as touch interfaces. Unsurprisingly, awareness of the Echo and devices like it continues to increase, jumping from 20% to 69% of Amazon users in the span of 16 months. The Echo also sold out in the 2016 holiday season, and analysts predicted that 10 to 12 million virtual assistants could be sold by Christmas. In fact, after the holidays Amazon reported that "customers purchased and gifted a record-setting number of devices from the Amazon Echo family with sales up over 9x compared to last year’s holiday season and millions of Alexa devices sold worldwide this year." Google Home also sold out in stores.
Although the sales numbers show promise, the future of voice interfaces depends on their ability to answer increasingly complex questions and integrate with existing apps. Google and Amazon’s vision is that new devices will enlist a multitude of specialized services to assist with daily tasks. To order take-out, one could simply say “OK Google. Ask Caviar to deliver Kung Pao Chicken from China Cafe.” To request a ride, say “Alexa. Ask Lyft for a ride to JFK.” Speaking in natural language is 3-4x faster than typing, so the addition of voice to consumer devices could feel like a more natural extension than manual input. The best-case scenario? A world where voice technology is more accurate than touch screens and buttons ever could be, accepting an unlimited number of inputs and getting smarter with each addition.
Voice interfaces are rapidly gaining popularity as the fastest and most efficient way to find information. Businesses will need to embrace this technology to automate requests for product information, customer support, and more on their apps, devices and websites. It is not yet clear who will dominate voice-controlled technology, but the company that does will create a multi-platform solution that integrates seamlessly with the devices and apps we already use, provided those apps support voice too.
Google and Amazon are at the forefront, but Microsoft’s plans to open Cortana to third-party hardware will bring another player into the mix. We look forward to 2017, when we will hopefully embrace the full power of voice.
Image Credit: Amedley / Shutterstock