Does voice-to-text technology actually work?

No matter how fast we learn to type, it’s never going to catch up to how fast we can speak. Luckily for us, technology might mean it doesn’t have to. Voice-to-text technology is entering more and more industries, and has the potential to make recording, dictating and communicating a whole lot faster.

Take ShoutOUT, a new iPhone messaging app released to the App Store last week. It’s taking advantage of huge improvements in voice-to-text technology to make SMS on the iPhone much easier, especially for those on the go. ShoutOUT wants to take over the SMS capabilities of your iPhone: you speak a text message, ShoutOUT’s computers transcribe it, and the app sends it to whomever you specify.

But what’s potentially interesting about ShoutOUT is the same thing that’s on the minds of companies like Jott, Dial2Do, and even Google: using your voice to make tasks that require typing go a whole lot faster. Whether it’s reading news, taking notes, or even sending email, it can all be done faster with your voice. Not to mention all the upsides of an alternative to texting while driving, or even walking. Get run into by a texter on the street, and you start to wish for better voice-to-text technology. Text, by virtue of being readable, is also easier to see at a glance, search through, and organize for later reference.

The knock on all these services, from Dragon Dictation to Jott to Google Voice, is that there’s a huge learning curve to get to know the app, train it to your voice, and use it properly, and often the apps just don’t work. Some apps don’t understand punctuation, so you have to say things like “I went to work this morning comma had three meetings comma one of which went for three hours exclamation point”. It’s not exactly a natural way of speaking. But some people swear by them.

There are a lot of apps out there trying to take advantage of your voice and improvements in speech recognition, and the possibilities are undeniably huge, driving acquisitions like Nuance’s purchase of Spinvox, another voice-to-SMS application, for $102.5 million. But the question is, do any of the applications actually work?

To find out, I signed up for several of the biggest players in the voice-to-text field, and gave them all a simple test: “Hello everyone, I’m David Pierce. I have three legs, two arms, fifty-one toes, and am more fun than a barrel of monkeys. I love to skydive, run around in circles, and wear my New York Giants shirt.” It’s got proper names, numbers, and some odd words, so it should be a decent comparative measure of how they do. Here’s a look at a few of the results:

Jott

Made by: Nuance Communications

Cost: $3.95/month and up, depending on how you use it

Use it for: Sending reminders to yourself, managing task lists, transcribing voicemail

Message Transcription: “Hello everyone, I’m David Pearce. I have 3 legs, 2 arms, 51 toes and a more fun than a barrel of monkeys. I love to sky dive, run around in circles and wear my near giant shirt.”

Dial2Do

Made by: Dial2Do

Cost: Limited Free version, $3.99/month for unlimited use

Use it for: Reading news, sending and receiving emails, sending reminders to other apps

Message Transcription: “Hello everyone. I’m David Peirce. I have 3 legs 2 arms. 51 toes and more fun at the (?) of monkeys. I love the sky dust run around the tickles and run my near time shirt.”

Google Voice

Made by: Google

Cost: Free (ad-supported)

Use it for: Transcribing voicemail, managing SMS and voicemail online and by email

Message Transcription: “Hello everyone, I did pierce. I have 3 lakes to Arms 51 toes and in more fun than available. Peace. I love to Scott, I’ve run around in circles and where my New York Giants shirt.”

Dragon Dictation

Made by: Nuance Communications

Cost: Free App Store app, paid versions of Dragon Dictation desktop software.

Use it for: Anything! Copy and paste to most other apps, on iPhone or computer.

Message Transcription: “Hello everyone and it appears I have three legs to arms 51 times and more fun than dial up monkeys I love to skydive run around in circles and where my New York Giants shirt”

ShoutOUT

Made by: Promptu Systems

Cost: $0.99 app in App Store, pay-per-message approach for transcribed messages

Use it for: Replacing the SMS feature on your iPhone, texting while driving

Message Transcription: “Hello everyone I’m David Pierce I do you like to to arms to see one toes in a more fun the girl monkeys I love to start I run around in circles and where my New York giants right”

As is made all too clear, this technology is far from perfect. People speak too differently, in everything from speed to accent to pronunciation of certain letters, and creating a technology that works for all of them is an incredibly tall order.
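For readers who want a number rather than an eyeball comparison, one common yardstick is word error rate: the fewest word insertions, deletions, and substitutions needed to turn a transcript into the original, divided by the length of the original. This is not how the test above was scored; it is a minimal sketch, and the function names `normalize` and `word_error_rate` are made up for illustration. It also treats “3” and “three” as different words, so it is harsher on services that write numbers as digits.

```python
import re


def normalize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    text = text.lower().replace("’", "'")  # treat curly and straight apostrophes alike
    return re.findall(r"[a-z0-9']+", text)


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from transcript
                            curr[j - 1] + 1,      # extra word in transcript
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# The test sentence and Jott's transcription, copied from the article.
reference = ("Hello everyone, I’m David Pierce. I have three legs, two arms, "
             "fifty-one toes, and am more fun than a barrel of monkeys. "
             "I love to skydive, run around in circles, and wear my "
             "New York Giants shirt.")
jott = ("Hello everyone, I’m David Pearce. I have 3 legs, 2 arms, 51 toes "
        "and a more fun than a barrel of monkeys. I love to sky dive, "
        "run around in circles and wear my near giant shirt.")

print(f"Jott word error rate: {word_error_rate(reference, jott):.2f}")
```

Swapping in any of the other transcripts gives a rough ranking, though a score like this says nothing about whether the errors that remain change the meaning.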

But the potential is great—Dial2Do and Jott let you do things like hear news, send tasks to yourself, and even update a blog; ShoutOUT might be able to make texting a little faster, and a whole lot less dangerous. There have been great improvements, but there’s still a long way to go before your voice can really replace your keyboard.

My fingers are sure crossed.
