The smart home device market is poised for serious growth. Just ask analysts at IDC, which forecasts that shipments will experience a 26.9% year-over-year uptick in 2019 to 832.7 million units. By 2023, IDC expects nearly 1.6 billion devices will ship to customers’ homes worldwide.
Amazon is counting on it. Of the 75% of respondents to a recent Dashbot survey who use voice assistants like Alexa at least once a day, 23% say they control smart home devices with their assistant. Of that group, 63% tap assistants for home automation multiple times a day.
Perhaps it’s no wonder, then, that Alexa is becoming more proficient at controlling lightbulbs, garage door openers, smart locks, and other smart devices. In October, a few months after Amazon launched an API that gives Alexa the ability to communicate with motion and door sensors, the Seattle company introduced developer tools to connect smart cameras and doorbells to Echo devices. On the newly launched Echo Show 5, a discovery panel highlights popular smart home device tasks. More recently, Amazon demonstrated Alexa Conversations for seamless multi-turn interactions, bringing easy-to-remember commands like “Alexa, start cleaning” to appliances such as Roomba robots.
To better understand Amazon’s work in the smart home ecosystem as it relates to Alexa, we spoke with Nathan Smith, who heads up the customer experience team creating new features for Alexa smart home customers. Here’s a lightly edited transcript of our discussion.
VentureBeat: I thought we could start with a high-level overview of Amazon’s approach to the “smart home” and voice interactions and then dive into some of the ideas you and your team are pursuing to make managing connected devices easier with Alexa. That sound good?
Nathan Smith: Sure. We think the smart home is in a period of mass adoption and expansion right now. Classically, it has comprised much more tech-forward earlier adopters, but we’re past that. There are now more than 60,000 products that work with Alexa from 7,400 different manufacturers, and a trend we’re seeing is that Alexa is democratizing control of these devices.
One of the things I’m most excited about this year is a new feature that uses machine learning and artificial intelligence to help Alexa understand not just what you say, but what you actually mean, and then provide a simple user experience around that.
The problem we’re solving came from customer feedback as we were onboarding people who didn’t necessarily have context concerning which smart devices were named what around their house. We ran into this over and over again — people were having trouble remembering the names of devices, which was only exacerbated as they added more devices to their homes.
What we’ve done is make Alexa a little bit more human-like. If you ask Alexa something like “Hey, Alexa, turn on the Sofa Lights” but the lights you’re trying to turn on are called Living Room Lights and Alexa is uncertain about which you mean, she’ll helpfully suggest “Oh, you know, did you mean Living Room Lights?”
This technology, which allows people to speak more casually in their homes and go beyond the strict syntax that Alexa previously understood, helps in a lot of different real-world use cases. One is words that have similar transcriptions and another is mixed characters, like when people add emojis to their [own] or their devices’ names [in the Alexa smartphone app]. It can resolve words without being strict about the exact pronunciation, and it can even help in multilingual cases. If you’re using a mix of names across different languages, Alexa can learn from that.
The context is that we’re trying to build toward a world where Alexa understands you in a much more natural way, rather than training people to talk in Alexa’s terms. If we have a pretty good idea of what you’re saying, we’ll simply perform the intended task, but what we’re evolving toward is a model where Alexa gets ground truths from customers. We don’t want to take power away from customers by acting without asking a clarifying question when we’re not 100% certain about something, but we also want Alexa to be helpful in ambiguous cases.
We started rolling out this feature in the U.S. at the end of December and recently expanded it to Canada, Australia, the U.K., and India. In terms of early results, when Alexa prompts a customer with a suggestion, they’re accepting it 80-90% of the time, on average.
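To make the act-or-ask behavior Smith describes concrete, here is a minimal Python sketch. It is not Amazon’s implementation: plain string similarity from the standard library’s `difflib` stands in for the semantic and behavioral models he mentions, and the function name `resolve_device` and both thresholds are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Both thresholds are illustrative assumptions, not values from Amazon.
ACT_THRESHOLD = 0.9      # at or above this, simply perform the task
SUGGEST_THRESHOLD = 0.5  # at or above this, ask "did you mean ...?"


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_device(spoken_name: str, device_names: list[str]):
    """Return ("act" | "suggest" | "unknown", best matching name or None)."""
    if not device_names:
        return ("unknown", None)
    best = max(device_names, key=lambda name: similarity(spoken_name, name))
    score = similarity(spoken_name, best)
    if score >= ACT_THRESHOLD:
        return ("act", best)       # confident enough to act directly
    if score >= SUGGEST_THRESHOLD:
        return ("suggest", best)   # uncertain: ask a clarifying question
    return ("unknown", None)       # no plausible candidate


devices = ["Living Room Lights", "Garage Door"]
print(resolve_device("living room lights", devices))
print(resolve_device("Sofa Lights", devices))
```

With these assumed thresholds, an exact name is acted on immediately, while “Sofa Lights” falls into the middle band and produces a “did you mean Living Room Lights?” style suggestion rather than a silent guess.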
VentureBeat: Which other factors does Alexa take into account when determining how to respond to a command, misspoken or not?
Smith: Gathering ground truths and assimilating them into semantic and behavioral models that learn from you in a very human way — the way a child would ask questions about the world — underpins the machine learning side [of Alexa]. What our models really do is layer on signals in terms of device state and behavioral signals — like which devices are usually switched on at which times — in addition to environmental signals, like date and time. The models use all of these to generate suggestions.
There’s a lot more work to do, and we think that we can expand the reach of this sort of helpfulness to other scenarios. We’re seeing more and more customers from different walks of life and different technology backgrounds using smart home devices with Alexa, and this is a first step to taking bleeding-edge technology and using it to help simplify the customer experience.
VentureBeat: AI and machine learning are obviously at the core of Alexa, from its language processing and understanding to the way it intelligently routes commands to the right Alexa skill. What are some of the other challenges you and your team are solving with AI? What has it enabled you to achieve?
Smith: At the feature level, there’s Hunches, where Alexa provides information based on what it knows from connected sensors or devices. When you say a command such as “Alexa, good night,” it checks whether your garage lights are still on and whether they’re usually off at that time of day, which informs the response. Alexa will say something like “Good night. You know, by the way, I noticed that your garage lights are on. Would you like me to turn them off for you?” and give customers helpful feedback at certain stages of smart home routines without requiring them to dig into a bunch of app screens.
These features use machine learning techniques enabled by Amazon Web Services. We run these real-time capabilities at scale on the SageMaker platform, which has given us the ability to iterate a lot more quickly.
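The Hunches check Smith outlines, comparing a device’s current state with its usual state at that time of day, can be sketched in a few lines of Python. This is an illustration only: the `DeviceStatus` class, its `usually_on_now` field (assumed to come from some learned usage pattern), and the `good_night_response` function are hypothetical names, not part of any Alexa API.

```python
from dataclasses import dataclass


@dataclass
class DeviceStatus:
    name: str
    is_on: bool
    usually_on_now: bool  # assumed output of a learned usage pattern


def good_night_response(devices: list[DeviceStatus]) -> str:
    """Build a reply that flags devices in an unusual state."""
    # A device is "unusual" if it is on now but is normally off at this time.
    unusual = [d.name for d in devices if d.is_on and not d.usually_on_now]
    if not unusual:
        return "Good night."
    names = " and ".join(unusual)
    return (
        f"Good night. By the way, I noticed that your {names} are on. "
        "Would you like me to turn them off for you?"
    )


home = [
    DeviceStatus("garage lights", is_on=True, usually_on_now=False),
    DeviceStatus("porch lights", is_on=True, usually_on_now=True),
]
print(good_night_response(home))
```

In this sketch only the garage lights are mentioned, because the porch lights are on but are normally on at that hour.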
VentureBeat: It seems, as you said a moment ago, that smart home adoption is on the rise, perhaps driven in part by cheaper connected devices, like Philips’ recently announced Bluetooth-compatible Hue series. What are some of the other ways you’re making onboarding simpler for first-time buyers?
Smith: We’ve been working really hard on that for a while now, and one of the things we’re most excited about is this ability to have a zero-touch setup. Last year, we announced Wi-Fi Simple Setup, which lets you quickly configure Amazon Wi-Fi devices like the Amazon Smart Plug. Basically, you plug it in and then Alexa will say “Hey, I found your new device.” There’s no other setup necessary. We’re bringing that same experience to Bluetooth Low Energy light bulbs like the new Philips Hue products, and we’re really working to expand the usage of this technology broadly.
As for configuration post-setup, once you get a device talking to Alexa, we released a couple of features at the end of last year that help you do some of the other setup and context-gaining by voice that you might need to have a fully natural interaction with Alexa. We want customers to be able to do things like put their devices in rooms so that when they refer to one device in a set of several, Alexa targets the right device.
That’s why we rolled out a more contextually sensitive setup experience last year. If you say “Alexa, turn on the lights,” she can walk you through setting up a room by voice and putting lights in it. We’ve seen customers really take to this because it doesn’t get in the way of controlling the device for the first time.
VentureBeat: I’m sure you have to account for different Alexa device form factors, right? I’m talking about an Echo Dot versus an Echo Show.
Smith: We think of it as a mesh among the different modalities — among the app, voice, and screen — because each has different strengths. Voice is really great when you’re trying to do something hands-free, but not great when you’re trying to do something quietly. That’s where we lean on screen-based interactions.
What we’re really excited about is ensuring that, as more diverse customers start to use Alexa, we’re keeping up with their needs and not looking backward and saying “OK, how do we teach these customers the sort of patterns of the past?” Instead, we’re using technology like machine learning to look forward and learn from them.
The key is using the technique that’s right for the type of problem, whether it’s examining a behavioral pattern or trying to establish semantic similarity with ground truths, and then tuning a meta-model that takes those individual signals into account, producing a user experience that’s helpful instead of one that makes assumptions.
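Smith does not say how the meta-model combines its inputs. One simple possibility, shown here purely as an assumption, is a logistic combination of per-signal scores. The signal names, weights, and bias in this Python sketch are hypothetical and chosen only to illustrate the idea of layering semantic, behavioral, and device-state signals into a single confidence value.

```python
import math

# Hypothetical weights and bias; a real system would learn these from data.
WEIGHTS = {"semantic": 3.0, "behavioral": 2.0, "device_state": 1.0}
BIAS = -3.0


def suggestion_confidence(signals: dict[str, float]) -> float:
    """Combine per-signal scores (each in [0, 1]) into one confidence in (0, 1)."""
    # Missing signals default to 0.0, so they contribute nothing.
    z = BIAS + sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing


strong = suggestion_confidence(
    {"semantic": 0.9, "behavioral": 0.8, "device_state": 1.0}
)
weak = suggestion_confidence({"semantic": 0.2})
print(round(strong, 3), round(weak, 3))
```

A confidence like this could then feed a decision of the kind Smith describes: act when it is high, ask a clarifying question when it is middling, and make no assumption when it is low.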