“Give me the latest weather report,” you say to your car. You listen, then direct it: “Turn the heat on to 70 degrees, and read me my stock quotes. Then play some Johnny Cash.” Sound futuristic? Yes, it does, even though it also feels like we should be there by now. That’s a testament to the difficulty of building voice-recognition systems.
However, that doesn’t mean innovative startups are giving up on that vision. VoiceBox, one of many companies presenting at this year’s Consumer Electronics Show in Las Vegas, is perhaps closer than any other company to making voice recognition simple and accurate enough for everyday devices.
The difficulty with voice software is that people have different voices and accents, not to mention different ways to phrase a command. In the above example, you might have told the computer to “get,” “report,” or “tell me” the stock quotes — any of which would baffle current software.
To add to the challenge, asking for Johnny Cash after stock quotes would probably send today’s applications off on a hunt for a publicly traded company called “Johnny Cash.”
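To make the phrasing problem concrete, here is a minimal Python sketch of a keyword-driven command router. It is a toy illustration only: the command table, the action names and both functions are hypothetical, and none of it is drawn from VoiceBox's or any other vendor's software.

```python
# Toy illustration: hypothetical command tables, not any vendor's real software.

# A naive system recognizes exactly one verb per action.
KEYWORD_COMMANDS = {
    "read": "READ_STOCK_QUOTES",
    "play": "PLAY_MUSIC",
}

# A slightly more forgiving system maps several verbs onto one canonical verb.
VERB_SYNONYMS = {
    "get": "read",
    "report": "read",
    "tell": "read",
}


def naive_route(utterance: str) -> str:
    """Look up the first word of the utterance in the keyword table."""
    words = utterance.lower().split()
    if not words:
        return "UNKNOWN"
    return KEYWORD_COMMANDS.get(words[0], "UNKNOWN")


def synonym_route(utterance: str) -> str:
    """Normalize the first word through the synonym table, then look it up."""
    words = utterance.lower().split()
    if not words:
        return "UNKNOWN"
    canonical = VERB_SYNONYMS.get(words[0], words[0])
    return KEYWORD_COMMANDS.get(canonical, "UNKNOWN")


if __name__ == "__main__":
    for phrase in ("Read me my stock quotes",
                   "Get my stock quotes",
                   "Tell me my stock quotes"):
        print(phrase, "->", naive_route(phrase), "|", synonym_route(phrase))
```

The naive router handles only the "read" phrasing, while the synonym table also accepts "get" and "tell." Neither version keeps track of context, so a hand-built table like this does nothing to solve the Johnny Cash confusion described above.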
The approach VoiceBox takes focuses on the actual meaning of sentences. The team has done cognitive mapping and studied context to come up with proprietary algorithms that help computers figure out what you’re actually talking about, rather than just responding to one or two words — as with the frustrating automated phone menus that airlines, banks and other businesses have come to rely on.
That differs somewhat from the approach of companies like Microsoft’s Tellme and Google, which are likely building large databases of voice requests in order to progressively understand more. It is a sort of brute-force approach that relies in part on having large volumes of data against which to check their own algorithms.
In a demo of VoiceBox’s technology, we saw that it’s not yet perfect, or even close to perfect. However, the program was able to respond somewhat intelligently to CSO Victor Melfi as he ordered it to search out different songs and switch between tasks. Switching between tasks is the real sticking point for most other companies. (See the diagram below for a better idea of various tasks the software might have to tackle.)
Although voice recognition has been touted in the past with no solid results, we may actually be close to seeing consumer applications, says Jackie Fein, an analyst covering emerging technologies at Gartner. “I’d say they or a company like them could get there in the next couple years,” she told us, adding that “it’s hard because it’s based on algorithmic improvements.” However, constant advances in processor speeds make it ever easier for calculation-intensive software like voice recognition to run on modern computers.
Once voice software becomes accurate enough for large numbers of people to use easily, the way will be open for other startups to develop any number of new applications. At the moment, VoiceBox is partnering with companies like IBM, Toyota and XM Satellite Radio to further develop its software.
Although VoiceBox has been developing its technology for about five years, the company has not yet taken funding. It is currently considering taking a round of venture capital.