We asked Neil Grant, Regional Sales Director for Nuance, about the future of text-to-speech and speech-to-text solutions and how the company, which is best known for its Dragon range of products, sees the competitive landscape. The interview was carried out before Nuance announced the acquisition of Swype and Loquendo, and before Apple's release of the iPhone 4S, which features Siri technology.
Can you provide us with an overview of what Nuance does?
Nuance Communications is a leading provider of speech and imaging solutions for businesses and consumers around the world. Its technologies, applications and services have been developed to transform the way people interact with information and how they create, share and use documents. Nuance has enjoyed considerable success for many years with its established imaging solutions products including OmniPage, PaperPort, PDF Converter and PDF Converter for Mac.
However, Nuance is probably better known for speech recognition technology, particularly its Dragon family of desktop speech recognition solutions, including Dragon NaturallySpeaking 11.5, and Dragon Dictate for Mac 2.5. Beyond the desktop, there are many current applications that also use Nuance's Dragon speech technology. Its speech solutions have been included in more than four billion mobile phones and 35 million cars. Millions of mobile consumers have downloaded Nuance's Dragon Mobile Apps, while more than 3,000 hospitals and 330,000 physicians use Nuance clinical documentation solutions.
Things have moved on a lot since the early days of Speech to Text and voice recognition in general; can you give us a quick glance at what's happening in the world of Speech to Text?
Speech recognition takes a lot of computing power to do well and to do quickly in order for it to be really commercially viable. Only over the last few years have computers had the additional power to run the complex processes required to accurately and easily recognise speech.
Thanks to the resources that Nuance has constantly invested in speech technology research and development, today's Dragon user experiences quicker, more accurate results. It is also easier to use, and it is winning new fans thanks to dramatically reduced enrolment times and the ease with which it enables users to control popular PC applications, search the internet, create and send emails, and of course dictate documents. Most recently, in response to how mobile device usage is changing, Nuance introduced a free Wireless Microphone app, which enables Dragon users to use their iOS device as a wireless microphone interface to Dragon.
What are the opportunities and challenges in the market?
One of the former challenges we faced was that perceptions of speech technology were founded on experiences of very early speech recognition systems. If you wanted to draw a comparison with respect to the progress that's been made, speech technology has enjoyed the same quantum leap in performance and acceptability as the diesel engine has in the last ten years. When people try speech recognition today for the first time, they are astonished at how easy it is to use, and how accurate it is.
In the meantime, Nuance's Dragon mobile applications have been very successful in bringing speech technology to even more users, and without a doubt they have played an important role in accelerating the acceptance of speech as an effective interface, and challenging any perceptions about speech.
With respect to opportunities, Nuance desktop speech solutions have been incredibly successful establishing themselves as essential tools in professions including legal, healthcare, media and education. Nuance has enjoyed great success in the assistive market, too. We receive many testimonials from users that have had their lives transformed by Dragon and the role it has played in helping them to regain their independence.
More recently, we've seen more interest from students as well as education institutions as they see the benefit of using Dragon to help with dissertations, reports and coursework, as well as updating social media profiles, sending emails, and chatting on Skype. The less time they spend writing, the more time they can spend learning or socialising! At the other end of the scale, we're also seeing interest from the senior living market: people who want to enjoy all of the communication and information access benefits of using a computer, but who might not feel comfortable using a keyboard and mouse. There really is a huge variety of usage, from novelists, to people who use Dragon to record their family history, to those who use it to dictate inventory reports. In fact, every year, Nuance hosts a free-to-enter 'I Speak Dragon' competition, which invites Dragon users to send us their stories about how they're using Dragon. The 2011 competition opens in October, and I'm sure that once again it will prove to be a great source of inspiring and fascinating user stories.
Despite enjoying high levels of customer satisfaction, we can't stand still. Nuance is always working to improve Dragon. Speech recognition is a statistical process, and the more data we can process, the better the recognition accuracy is going to be. It is something that we have to keep on top of, because new words are constantly filtering into day-to-day language.
How important is mobile for the company?
Mobile is important for Nuance because mobile devices are important for our customers, whether they're being used for personal or business use. Increasingly, and especially in the case of smartphones, that usage division is blurring, as personal devices are used for work. Nuance's mobile applications have been designed to add value to the user experience.
Currently, Nuance produces applications and technologies for iOS, Android and BlackBerry. The mobile applications are important products for Nuance, both for raising awareness of the brand and for demonstrating the accuracy of today's speech recognition technology, in order to get speech into the hearts and minds of the mass market.
How far are we from a truly real-time, 100 per cent accurate, no-training-needed speech-to-text solution?
That's a good question. Our technologies are constantly being developed. Accuracy is not something we're ever going to sit back and take for granted. We measure accuracy in terms of word error rate. So if you're creating a 1,000-word document, 99% accuracy would mean ten errors, and that's something we can and will work to improve on. Further improvements users can expect might centre on natural language: understanding what people want the computer or device to do, without them having to learn exact commands. What we can assure current and future customers is that they will continue to see further recognition accuracy improvements with each new version or application.
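For readers unfamiliar with the metric, word error rate is conventionally computed as the minimum number of word substitutions, deletions and insertions needed to turn the recognised text into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition and the 1,000-word arithmetic from the answer above. It is not Nuance's implementation: the function names are hypothetical, and treating "accuracy" as one minus the word error rate is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real evaluations usually normalise case and
    punctuation first, which this sketch does not do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word was missed
            insertion = curr[j - 1] + 1   # extra word was recognised
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def expected_errors(word_count: int, accuracy: float) -> int:
    """Errors implied by an accuracy figure, assuming accuracy = 1 - WER."""
    return round(word_count * (1.0 - accuracy))


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
    # The interview's example: a 1,000-word document at 99% accuracy.
    print(expected_errors(1000, 0.99))
```

Because insertions count as errors, the word error rate can exceed 100 per cent on very poor output, which is one reason an accuracy percentage is only an approximation of it.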