With the unveiling of Apple’s HomePod at WWDC 2017, the race has officially started: we are all waiting to see who will emerge as the leader in the voice-activated intelligent speaker market.
And while the early adopters get to work integrating these devices into their lives and the late adopters drag their heels because of security fears, we might all still agree that the trend toward voice-activated technology shows no signs of slowing down. It’s been around for decades, but recent years have seen it trickle down into our cars, phones, fridges, and even lamps. Perhaps home domination by these intelligent speakers and their virtual assistants is a sign that we’re at the tipping point of another technological revolution.
After all, underpinning the voice-activated tech race is a subtle but powerful message, part warning and part promise: screen time is vanishing.
Why screens cannot compete with the human voice
According to the Linguistic Society of America, when we use speech as a way to convey and gather information, we are tapping into a deeper, more primal part of our humanity. While we’ve only been writing for roughly 6,000 years, we’ve been speaking for much longer than that. Children are talking by the time they’re two years old, but writing (especially legibly) takes much more time. Truly, human-to-human interaction through speech is the original user interface.
It makes sense, then, that when we want to quickly check the weather, the sports score, or our flight departure time, we should ask out loud, rather than load a webpage, type in a search, sift through the results, and read the information.
Granted, there are some search terms for which we would not want the results announced out loud (for example, “find nearby jewelry stores that sell engagement rings”) and others where the results will be visual (“show me designs for engagement rings”).
But for many day-to-day tasks that require general information, voice-activated tech provides a quicker way to organize information on the fly and get the data you need to move on with your life.
Who will win the home market for intelligent speakers?
The fact that these devices are making a play for a natural interaction between humans and technology means that the provider who is able to get closest to a true human exchange, with accurate results and functionality, will win.
Specifically, can the speaker tell the difference between a child’s voice and an adult’s and adjust its results accordingly? Will it be able to send written (speech to text) messages to others that express the intention accurately, complete with correct punctuation? How apparent is it to the user that they are interacting with a machine?
Amazon has programmed Alexa (the voice of its Echo intelligent speakers) with some humorous responses, a way for the device to show “personality” and express some human subtlety. Google has employed the mind of an ex-Pixar storyboard artist to help create Google Assistant’s personality. And Apple has announced that Siri is also getting some vocal improvements, namely more natural elocution and responses that draw on context, thanks to the bot’s ability to gather data from your calendar, location, and current activity on your phone.
Not only are these devices attempting to erase the screen interface; the companies behind them also want you to feel that the technology itself is gone, replaced with an authentic human experience powered by AI.
Challenges in a voice-activated future
While sophisticated algorithms and other technical enhancements undoubtedly help guide the tone of a device’s response, it’s worth remembering that there are still limits to what can be achieved with the synthetic voices some of these devices use (the original U.S. voice of Apple’s Siri, for example, was built from recordings by voice actor Susan Bennett).
As voice-activated tech continues to expand and more and more companies take advantage of this new kind of interaction, it’s likely that brands will find themselves searching out voices that sound like them — not just in vocal quality, but in demeanor.
Brands will also have to pull out all the stops when it comes to creatively designing how they will integrate voice naturally into both their products and their websites. Elements such as UX/UI design, and the scripts a voice interface operates on, will become all the more crucial as the screen dissolves and the human voice once again reclaims center stage.
David Ciccarelli is the CEO of Voices.com, an online marketplace of voice actors.