“Integrating Voice Everywhere: The Rise of the Personal Assistant”: Moderated by Peter Mahoney, CMO, and Gary Clayton, Chief Creative Officer, Nuance Communications
William Shatner (aka Captain James Kirk) was able to have a conversation with his spaceship in Star Trek. The ship’s computer had a pleasant voice, much like Siri, the Apple assistant widely reported to use Nuance speech recognition. Kirk’s computer could carry on a conversation and respond to follow-up questions. It could also process the rich context of what the ship was encountering in space, filter out the most pertinent information and report it to the Captain in real time. The computer chose what to report based on what would help the captain protect the welfare of the ship (which is an example of a device domain).
How close is the industry to Star Trek and a personal assistant as effective as “Computer” was for Captain James Kirk? When will our phones be able to alert us before paying for a meal that we should use our rewards credit card for dinner in order to get United Miles? When will our devices be able to use the rich context in our emails to remind us to contact colleagues and arrange important meetings? And when will our devices be able to advise us to leave earlier for our daughter’s soccer game because there has been an accident on 520 West?
Peter Mahoney says that voice-recognition personal assistants are currently in a state of “helpful yet dumb [subservience].” They are domain-specific (entertainment, dining and lodging are examples of domains), and they can access and use other systems like ‘Open Table’ in an “intelligent” manner. However, they cannot yet carry a dialogue through follow-up questions.
The near future looks promising. According to Nuance, in the “next generation” (that is, within a year), Siri should be able to manage a dialogue and ask appropriate follow-up questions. This is much better than the current “fail-safe” question: “Can I search the internet for you?” Implicit learning will also enable systems to learn from users’ responses. If a user does something twice, the device will jump to that option first.
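The “does something twice” rule is simple enough to sketch in a few lines of Python. This is only an illustration of the idea as described on the panel, not Nuance’s implementation: the class name `ImplicitPreferences`, its methods and the threshold of two are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


class ImplicitPreferences:
    """Hypothetical helper: remembers which option a user picks per request type."""

    def __init__(self, threshold=2):
        # Assumed rule from the talk: a choice made twice gets promoted.
        self.threshold = threshold
        self._counts = defaultdict(Counter)

    def record(self, request, choice):
        """Note that the user picked `choice` when making `request`."""
        self._counts[request][choice] += 1

    def rank(self, request, options):
        """Return options with learned favorites first, the rest in original order."""
        counts = self._counts[request]

        def key(option):
            n = counts[option]
            # Only choices at or above the threshold move ahead; sorted() is
            # stable, so everything else keeps its original position.
            return -n if n >= self.threshold else 0

        return sorted(options, key=key)


# Illustrative usage with made-up option names.
prefs = ImplicitPreferences()
options = ["search the web", "book a table", "call the restaurant"]
prefs.record("dinner", "book a table")
prefs.record("dinner", "book a table")
print(prefs.rank("dinner", options))
```

After two recorded choices, “book a table” is offered before the generic web search, which is the behavior the panel described.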
If a generation of technology comes full circle in one year, what can we expect in terms of evolution over multiple generations of personal assistant systems? What are the great minds in the industry focusing on for the personal assistant products over the next 10-15 years?
Nuance’s leaders see the future in what they term “natural language assisting capability.” This is the ability for every device everywhere (whether car, TV, mobile device, etc.) to understand the same language. They also see the future in using information from different geographical domains. For example, Dragon-Go can currently use Open Table to find restaurants in U.S. cities, but it cannot branch out to services like Open Table that are available in Australia. Lastly, leaders wonder if they can “prime” a third party system like Open Table with information from a body of knowledge such as an email database.
Drawing relevant information about specific tasks and domains from large information databases like a personal email account is a great future challenge. How can a system be created to “encode” a source that vast, let alone pick up on the language subtleties necessary to make helpful suggestions as a personal assistant? People refer to these subtleties as “implicit” indicators, as opposed to an “explicit” direction such as telling the assistant, “Find this.”
The most important question to ask is: how does Nuance make the jump from helpful but dumb personal assistants to assistants worthy of James Kirk? First, Nuance is working to provide domain depth by enlisting the help of tens of thousands of developers. This raises the issue of whether these developers are truly up to the task of creating systems that can work with each other as well as on their own. The issue matters because the majority of leaders see the future in the integration of multiple devices that communicate with each other, perform domain-specific tasks and understand the same spoken language. This integrated approach stands in opposition to a single assistant that stores all of the contextual information (like the speed of your car, the rate of your heartbeat and the time of day) in one database. Cumbersome is the word that comes to mind with the single database option.
An important issue in integration is open versus closed (vertically integrated) systems. The fear is that opening a system invites the risk of degraded service quality. For example, what if Siri enlisted the help of an internet search engine for a piece of information and it came back with the wrong answer? Siri cannot control its own product quality because it is “open” to another system and cannot guarantee the accuracy of answers provided through external search engines.
Nuance asserts that it cannot possibly develop the “Star Trek computer” personal assistant alone and that other industry players are sure to arrive. This point is well taken, but currently Nuance is the industry leader. Why not use this advantage to control the supply chain of third party sources and develop a system of marketplace standards for third party sources?
James Kirk says, “Food for thought. Computer... full speed ahead.”