Last July, Amazon announced the beta launch of Alexa Conversations, a deep learning-based way to help developers create more natural-feeling apps for Alexa with fewer lines of code. Today marks the general availability of Conversations in U.S. English, which Amazon claims is the first and only AI-based dialog manager for voice app development.
The pandemic appears to have supercharged voice app usage, which was already on an upswing. According to a study by NPR and Edison Research, the percentage of voice-enabled device owners who use commands at least once a day rose between the beginning of 2020 and the start of April. Just over a third of smart speaker owners say they listen to more music, entertainment, and news from their devices than they did before, and owners report requesting an average of 10.8 different tasks per week from their assistant this year, compared with 9.4 in 2019. And according to a new report from Juniper Research, consumers will interact with voice assistants on 8.4 billion devices by 2024.
Alexa Conversations, which was announced last June in developer preview at Amazon’s re:MARS conference, shrinks the lines of code necessary to create voice apps from 5,500 down to about 1,700. Because it uses AI to better understand intents and utterances so developers don’t have to define them, Amazon also says Conversations cuts Alexa interactions that might have taken 40 exchanges down to a dozen or so.
Conversations’ dialog manager is powered by two innovations, according to Amazon: a dialog simulator and a “conversations-first” modeling architecture. The dialog simulator generalizes a small number of sample dialogs provided by a developer into tens of thousands of annotated dialogs, while the modeling architecture uses the generated dialogs to train deep-learning-based models that support dialogs beyond the simple paths laid out in the samples.
Developers supply things like API access and entities the API has access to, in effect describing the app’s functionality. Given these and a few example exchanges, the Conversations dialog manager can extrapolate the possible dialog turns.
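To make the idea of generalizing a few sample dialogs more concrete, here is a minimal Python sketch. It is not Amazon's simulator or any part of the Alexa Skills Kit; the names `SAMPLE_DIALOG`, `ENTITIES`, and `simulate` are hypothetical, and the toy only varies slot values, whereas the real system is also said to cover dialog paths beyond the samples.

```python
from itertools import product

# Hypothetical sample dialog a developer might write:
# each turn is (speaker, text template, annotation).
SAMPLE_DIALOG = [
    ("user", "I want {count} tickets for {movie}", "request_tickets"),
    ("alexa", "Booking {count} tickets for {movie}. Confirm?", "confirm_booking"),
    ("user", "Yes", "affirm"),
]

# Hypothetical entity values the developer supplies alongside the sample.
ENTITIES = {
    "movie": ["Parasite", "Little Women", "1917"],
    "count": ["two", "three"],
}


def simulate(sample, entities):
    """Expand one sample dialog into one annotated dialog per slot combination."""
    names = sorted(entities)  # fixed order so the output is deterministic
    dialogs = []
    for values in product(*(entities[name] for name in names)):
        slots = dict(zip(names, values))
        turns = [
            {"speaker": speaker, "text": template.format(**slots),
             "act": act, "slots": slots}
            for speaker, template, act in sample
        ]
        dialogs.append(turns)
    return dialogs


generated = simulate(SAMPLE_DIALOG, ENTITIES)
print(len(generated))           # one dialog per (count, movie) pair
print(generated[0][0]["text"])  # first user turn of the first variant
```

Even this crude expansion turns one sample into six annotated dialogs, which hints at how a handful of developer-written exchanges could become training data at a much larger scale.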
In the time since the beta’s launch last summer, Amazon says it has added improved error messages, dialog cloning, command-line interface support, enhanced authoring workflows, and an updated app design guide. For example, the Alexa Conversations Description Language, which launches in beta this week, allows developers to author Conversations dialogs in a declarative manner with type safety, semantic validation, modularity, and reusability. It works with the Alexa Skills Kit Command Line Interface and supports a syntax highlighter extension for Visual Studio Code.
There’s also the new Alexa Entities feature, which lets developers resolve strings in a customer’s command to entities from Alexa’s knowledge graph, including people, places, and things. Developers can use these entities as an entry point to traverse Alexa’s structured knowledge. Connections between entities can also produce natural exchanges. For instance, a customer might say, “Add Alias Grace to my reading list,” and Alexa might respond, “Got it. Have you thought about reading The Testaments, also written by Margaret Atwood in 2019?” Alexa Entities can also be used to fetch information about movies (e.g., films by Quentin Tarantino), countries (the population and capital of Belgium), animals (the average weight of a hippo), and more.
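The entity-resolution and traversal steps described above can be illustrated with a small Python sketch. This is a toy stand-in, not the Alexa Entities API: the `GRAPH` layout, the entity ids, and the `resolve` and `also_by_same_author` helpers are all assumptions made for illustration.

```python
# Toy knowledge graph (hypothetical structure): entity id -> properties.
GRAPH = {
    "book:alias_grace": {"name": "Alias Grace", "author": "person:margaret_atwood"},
    "book:the_testaments": {"name": "The Testaments", "author": "person:margaret_atwood"},
    "person:margaret_atwood": {"name": "Margaret Atwood"},
}


def resolve(text, graph):
    """Return the id of the entity whose name matches text (case-insensitive), or None."""
    wanted = text.strip().lower()
    for entity_id, props in graph.items():
        if props["name"].lower() == wanted:
            return entity_id
    return None


def also_by_same_author(book_id, graph):
    """Follow the book's author link and list that author's other books."""
    author_id = graph[book_id]["author"]
    return [
        props["name"]
        for entity_id, props in graph.items()
        if props.get("author") == author_id and entity_id != book_id
    ]


# Resolve the string from the customer's command, then traverse from it.
book_id = resolve("alias grace", GRAPH)
suggestions = also_by_same_author(book_id, GRAPH)
author = GRAPH[GRAPH[book_id]["author"]]["name"]
print(f"Have you thought about reading {suggestions[0]}, also written by {author}?")
```

The resolved entity acts as the entry point: once the string maps to a node, the follow-up suggestion comes from walking its links rather than from anything hard-coded in the app.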
Conversations’ first use case, demoed in 2019, seamlessly strung Alexa apps together to let people buy movie tickets, summon rides, and book dinner reservations. (OpenTable, Uber, and Atom Tickets were among Conversations’ early adopters.) In light of the pandemic, that scenario seems less useful. But Amazon says it merely illustrates how Conversations can combine elements from multiple apps without much effort on developers’ part. Companies like iRobot and Philosophical Creations (which publishes the Big Sky app) are already using it.
Now the number of developers who’ve engaged with or published voice experiences using Conversations has grown to “thousands,” Amazon says, including iRobot and The Art Institute of Chicago. Teams at Amazon and Amazon-owned Ring tapped Conversations to build Alexa’s What Should I Read Next and Alexa Greetings features, respectively. What Should I Read Next helps Amazon customers use Alexa to browse books and find new recommendations, while Alexa Greetings helps Alexa determine intention and take actions like delivering instructions or taking a message.
“Our customers can now schedule cleaning jobs in their own words,” iRobot CTO Chris Jones said in a press release. “And since Alexa manages the dialog, we can improve the experience by retraining the AI instead of rewriting and retesting skill code. This has fundamentally changed our voice development process, and we’re excited by how this speeds up innovation for iRobot customers.”