Voice technology that powers devices like Amazon’s Alexa and Google Home is the next frontier for tech companies. Facebook’s recent announcement of ParlAI only intensified the industry’s ambition to reach the ultimate goal of having meaningful conversations with computers by voice. But let’s hold our horses; we’re not there yet.
At my company, we’re receiving a growing number of requests from brands eager to explore this emerging utility, and at the same time, we’re working with engineering and product teams to understand exactly what the technology can do. As it stands, we still have leaps to make, but one thing is sure: voice-activated tech is set to get smarter, and fast.
Recently, I was speaking with a top executive at a leading global consumer products company. While watching TV, he saw an interesting ad for a product similar to his own. This prompted him to test Alexa. He asked what the best brand for the product category was, and Alexa promptly responded with a list of competitors. Later, one offered to send him a sample, and another listed best prices. This goes to show that while we may not yet be having meaningful conversations, voice-activated tech is rising and so are opportunities for brands to embrace it.
The good, the bad, and the promising
A recent study shows the U.S. market for voice-activated assistants has grown nearly 130 percent since 2016.
Today, Amazon Echo (Alexa) and Google Home — which differ from Apple’s Siri and Google Now in that they’re independent, stationary devices — dominate the market. Their main function is to provide a “smarter home” by calling up music, reminding you of your agenda, and even answering trivia questions.
One of the biggest benefits of voice-activated tech is that it saves time. Speaking is more natural than writing, and because you don’t have to take out your phone, it’s faster. It’s also more accessible for those who, for one reason or another, aren’t able to use keyboards or screens. Enthusiasts like Hugh Durkin, a product manager for Intercom, have even exclaimed: “Soon unnecessary typing and tapping on a keyboard will be a memory of the distant past.”
Perhaps. But this feature is still prone to error. When many people are speaking close to a device at once, it tends to have difficulty actually hearing the activation phrase. In the end, if you have to repeat your request again and again, it can be more time-consuming than just walking over to flip a switch.
There’s also the issue of privacy to consider. Burger King’s recent TV ad using “OK, Google” is a prime example of this. The ad used the wake word “OK, Google” to prompt devices to describe its burgers, but within hours of release — and hilarious edits to the Whopper Wikipedia page — the commercial was pulled. The widespread coverage of this ad highlighted the fact that voice technology is still new for many, and the idea of anyone, or anything, listening in on people is unnerving.
These issues are mere glitches, however. The biggest challenge is that although we’ve created processes that allow computers to get better at translation, voice recognition, and speech synthesis, most computers still don’t understand the meaning of language. Mark Zuckerberg himself said: “No AI system is good enough to understand conversational speech just yet. [It] relies on both listening to what you say and predicting what you will say next. Structured speech is still much easier to understand than unstructured conversation.” And research confirms the average person is struggling to find value in adopting voice tech in their daily lives.
Brands should prepare for tomorrow, starting today
The list of current limitations is long. Despite these drawbacks, advances in machine learning mean that computers are getting better at recognizing what people are saying. We’re not there yet, but Zuckerberg’s ambition of AI that understands conversational speech may not be far off.
In 2011, the global voice recognition market was valued at nearly 47 billion. Six years later, that figure has more than doubled to 113 billion. Along with Facebook’s newly announced investment, there’s a rush to accelerate the transition from speech recognition to natural language processing at scale. Once this is achieved, Zuckerberg’s wish for computers to have more sophisticated conversations will become possible.
Brands can start preparing for this new frontier today. As my earlier example of Alexa demonstrates, more and more consumers will soon be turning to these products to compare options and make purchases. Brands need to anticipate this change now by integrating these devices into their ecommerce and marketing strategies. Just as online shopping transformed the brick-and-mortar retail experience, voice activation technology will transform shopping once again.
Each day, the promise of meaningful conversation and results-oriented solutions from humans interfacing with computers comes closer to reality. Let’s all continue to explore and contribute to these technologies as they become smarter and more meaningful…one word at a time.
S. Jason Prohaska is the Managing Director of MediaMonks, a global digital production company.