Driven by advances in machine learning, conversational AI like the kind at play in Alexa, Google Assistant, and other prominent virtual assistants is only going to grow in adoption as the biggest tech companies in the world compete to put their assistants in your home, workplace, and car.
Today people mostly use assistants for basics like checking their calendar or the weather, but they also use them for controlling home devices, replacing their TV remote, and performing voice search and shopping.
With smart speaker adoption expected to grow sixfold in the years ahead, there are no signs of that momentum slowing down.
Through years of development, Google Assistant lead engineer Scott Huffman says his team has learned some lessons along the way, including five essential rules to follow in the age of voice computing.
Huffman laid out those rules in a blog post today and onstage at Transform, a VentureBeat AI event held August 21 and 22 at The Seminary at Strawberry in Mill Valley, California.
Huffman’s opinion on the matter is worth following. Back in May, Huffman announced that the Assistant was available on 500 million devices, and data from Canalys released last week found that Google Home speakers outsold market-leading Echo speakers worldwide for the first six months of 2018.
1. Voice is about getting stuff done
The vision of an all-powerful, intelligent assistant that acts both as companion and conversationalist isn’t quite there yet. Interactions with Google Assistant are 40 times more likely to be related to taking an action than a typical Google search, Huffman said.
“People have been asking Google for years about everything you can imagine but we noticed a real fundamental shift when we moved to voice. People started asking us to do things, not just to get answers,” he said.
People are most likely to ask Google Assistant to do things like make calls, set reminders, locate their smartphone, or set their phone to Airplane Mode. Users are often busy when interacting with an assistant, and a hands-free interface suits that.
Huffman believes we’re at the beginning of a major computing shift, and the sorts of things you can expect your assistant to help you with will improve over time.
“I kind of feel like where we are with voice reminds me a little bit of where we were with the web 25 years ago,” Huffman said. “The tools are a little clunky, the interactions, it’s not always entirely clear what to build, but it’s clearly onto something big, and so I just encourage you to dive in and build some conversations.”
In Google’s continued effort to help people get stuff done, Huffman introduced a series of features onstage at the I/O developer conference, including the ability to request multiple actions in a single sentence and the ability to ask follow-up questions without the need to say “OK Google.”
Starting today, Google Assistant users can also say “Tell me something good” to hear doses of good news.
2. Voice does not mean the end of screens
Back in 2016, Siri cocreator Babak Hodjat posited that the increasingly popular Amazon Echo speakers would hit a ceiling without a screen.
The original and still most popular line of Echo and Home speakers may have lacked screens, but Huffman says screens change everything. For example, Google Assistant has been able to help you cook a meal with step-by-step instructions for more than a year now, but Huffman admits it’s a little hard to do with a speaker alone.
“The point of not being overindexed on the speakers alone is actually a point that I make to my team all the time,” he said.
The benefits of visuals are also abundantly clear when shopping, looking at lists, or getting directions sent to your smartphone.
They can also help people connect with local businesses, both from in-car infotainment displays and from other devices that have a screen and access to Google Assistant. “I think as we move beyond device-level command and control, the next one we’re really seeing coming is more local services,” he said.
Google Assistant is getting smarter about how it operates on visual surfaces. Besides crafting hands-free TV control, developers using the Actions on Google third-party platform have gained the ability to create visual voice apps.
Visual snapshots of your day debuted in July.
As part of a series of changes introduced for Google Assistant in recent months, besides the beginning of experimental Duplex AI trials for scheduling reservations and six new voices, Google’s Lens computer vision on Android smartphones has gained real-time analysis abilities and new features like Style Match for fashion intelligence.
Rumors have emerged in recent weeks about major potential next steps for Google Assistant in the visual space. Google is reportedly working on its own smart speaker with a screen to compete with Amazon’s Echo Show and third-party speakers on the way from Sony, LG, and JBL. The 8- and 10-inch Lenovo Smart Displays with Google Assistant hit store shelves July 26.
Google is also reportedly planning to release the Pixel 3 this fall with a charging case that allows users to interact with Google Assistant while the phone is charging.
3. Daily routines drive adoption
One of the most effective ways to become a regular fixture in users’ lives is to become part of their daily routine, as made clear by some of the most popular Alexa skills, according to data from voice app developers.
Google Assistant users, for example, are likely to ask for news and weather in the morning, message friends in the afternoon, and play music or cast video in the evening.
Google introduced the Routines feature in March for people to create custom commands, so you can say “OK Google, I’m home” to turn on your lights, change the temperature, and perform other regularly used actions. Amazon released a similar feature for Alexa in October 2017.
4. Voice requires no user manual
A natural language interface means people need no training or user manual to use a virtual assistant. “Google Assistant really defies the early adopter stereotype. So women are our fastest growing segment of users; seniors and families are having huge adoption of the technology; again, really defying that early adopter stereotype,” Huffman said.
It’s also demonstrative of what happens when you don’t need to learn how to use an operating system: in India, Google Assistant usage is up threefold this year compared to 2017.
“As brand new users come onboard that haven’t been online before, what we’re seeing is they bypass some of the traditional app and silo-based ways of accessing services and are actually jumping straight to the voice assistant,” Huffman said.
5. Conversational AI is, well, conversational
Though people focus on getting things done with voice assistants, that doesn’t mean they don’t get chatty.
People are 200 times more likely to chain together multiple contextually related commands or questions with Google Assistant than with Google search, because expectations about what an assistant can do are growing.
“People are expecting real conversations, so with any technology when you start out, people maybe their expectations are low, but as it starts to work, people’s expectations go up quickly. So what we’re seeing today is people have more and more complex conversations with voice technology,” he said.
This may not translate into considering your assistant to be a friend or a cure for loneliness, but the growing confidence does mean people expect their assistant to handle a command in the many variations in which it can be phrased in natural language.
“Even with a simple thing like setting an alarm, people ask daily 5,000 different ways how to set an alarm alone,” Huffman said.
In time, Huffman hopes Google Assistant gets smarter about the person with whom it’s speaking. For example, someone using the Fandango voice app to order movie tickets might want Google Assistant to remember that their wife and kids are coming along when it recommends movies to see.