Amazon cuts Alexa skill recommendation error rate by 12%

Amazon's Alexa assistant has lots of skills -- more than 50,000, at last count. That makes it tough to discover new ones if you're not sure where to start, but fortunately, scientists at the Seattle company are tackling the problem with artificial intelligence (AI).

In a blog post this morning, Young-Bum Kim, a data scientist at Amazon's Alexa AI division, detailed the machine learning system that automatically selects the best skill to handle a particular request. Recent modifications made to it -- the results of which will be presented at the 2018 Conference on Empirical Methods in Natural Language Processing this week in Brussels -- reduced its error rate by 12 percent.

As Kim explained, the model comprises two neural networks, or layers of mathematical functions that mimic the behavior of neurons in the brain.

The first -- dubbed the "shortlister" -- produces a list of candidate skills that might be appropriate for a given request, taking into account skills already linked to the requester's Alexa account. (Kim notes that linking is a strong indicator of preference.) Within the shortlister, an "attention mechanism" dynamically assigns a weight to each of the linked skills, modifying the probability that any one of them will make it onto the shortlist.
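To make the idea concrete, here is a minimal sketch of how attention weights over linked skills could nudge a shortlist. It is not Amazon's architecture: the skill names, the scores, the `shortlist` function, and the choice to simply add each attention weight to a base score are all assumptions made for illustration.

```python
import math

def softmax(values):
    """Turn arbitrary scores into weights between 0 and 1 that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def shortlist(base_scores, linked_relevance, k=3):
    """Return the k highest-scoring skills after boosting linked ones.

    base_scores: skill name -> score from the request alone (hypothetical).
    linked_relevance: linked skill name -> relevance to this request (hypothetical).
    """
    names = list(linked_relevance)
    # Attention weights are computed only over the skills the user has linked.
    weights = softmax([linked_relevance[n] for n in names]) if names else []
    attention = dict(zip(names, weights))
    # Assumption: the attention weight is added directly to the base score.
    adjusted = {
        skill: score + attention.get(skill, 0.0)
        for skill, score in base_scores.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)[:k]

# Made-up skills and scores, for illustration only.
base_scores = {"weather_a": 1.2, "weather_b": 1.0, "trivia": 0.9, "recipes": 0.3}
linked_relevance = {"trivia": 2.0, "recipes": -1.0}

print(shortlist(base_scores, {}, k=2))                # no linked skills
print(shortlist(base_scores, linked_relevance, k=2))  # linked skills boosted
```

In this toy setup, a linked skill that is relevant to the request moves up the ranking, while a linked but irrelevant one receives almost no boost.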

The second uses more detailed information -- including whether the skills' developers indicated in metadata which actions their skills are able to perform -- to choose among those skills.

Previously, Alexa researchers trained the shortlister network end-to-end; every component of the network was evaluated based on how it contributed to the accuracy of the output. But the newly improved AI model also considers intended skills -- i.e., linked skills invoked when a user requests something -- in determining probability. As a result, the network now more reliably selects linked skills when a user intends them, Kim wrote.

To gauge the improved AI system's robustness, the Alexa AI team tested three different versions that used two distinct functions to generate the weights applied to linked skills -- softmax, which generates weights with values between 0 and 1 that must sum to 1, and sigmoid, which also produces weights ranging from 0 to 1 but places no restriction on their sum. (The previous version of the shortlister network used softmax exclusively.)
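The difference between the two functions is easy to see in a few lines of Python. This is a sketch using the standard definitions of softmax and sigmoid; the three input scores are hypothetical and do not come from Amazon's system.

```python
import math

def softmax(scores):
    """Weights between 0 and 1 that sum to 1 across all linked skills."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(scores):
    """Independent weights between 0 and 1; their sum is unconstrained."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

# Hypothetical relevance scores for three linked skills (illustrative only).
scores = [2.0, 1.0, -1.0]

softmax_weights = softmax(scores)
sigmoid_weights = sigmoid(scores)

print("softmax:", softmax_weights, "sum =", sum(softmax_weights))
print("sigmoid:", sigmoid_weights, "sum =", sum(sigmoid_weights))
```

With softmax, raising one skill's weight necessarily lowers the others, because the weights compete for a fixed total of 1. With sigmoid, each weight is computed on its own, so several linked skills can receive high weights at once, or none at all.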

The best-performing model of the three reduced the error rate by 12 percent when tasked with producing shortlists of three candidate skills, Kim wrote.
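For readers curious how such a figure might be computed, the sketch below counts a request as an error when the correct skill is missing from its three-item shortlist, then compares two error rates. The data, the function names, and the error rates of 0.25 and 0.22 are invented for illustration, and the sketch assumes the 12 percent figure is a relative reduction, which Kim's summary as reported here does not spell out.

```python
def shortlist_error_rate(shortlists, correct_skills):
    """Fraction of requests whose correct skill is missing from the shortlist."""
    misses = sum(
        1
        for candidates, correct in zip(shortlists, correct_skills)
        if correct not in candidates
    )
    return misses / len(correct_skills)

def relative_reduction(baseline_error, new_error):
    """Drop in error rate, expressed as a fraction of the baseline."""
    return (baseline_error - new_error) / baseline_error

# Four made-up requests, each with a three-item shortlist (illustrative only).
correct_skills = ["a", "b", "c", "d"]
shortlists = [
    ["a", "x", "y"],  # hit
    ["x", "y", "z"],  # miss: "b" is absent
    ["c", "a", "b"],  # hit
    ["d", "x", "y"],  # hit
]

error_rate = shortlist_error_rate(shortlists, correct_skills)
print("error rate:", error_rate)

# Hypothetical before/after error rates, chosen only to show the arithmetic.
print("relative reduction:", round(relative_reduction(0.25, 0.22), 2))
```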

Amazon's use of AI extends beyond skills selection. Its context carryover model allows Alexa to understand "multi-turn utterances" -- in essence, follow-up requests with explicit pronoun references (for example, "Alexa, what was Adele's first album?" followed by "Alexa, play it."). A separate AI system allows Amazon's Echo speakers to recognize up to ten distinct user voices. Moreover, back in November, Amazon's Alexa team said it was beginning to analyze the sound of users' voices to recognize mood or emotional state.

That's just the tip of the iceberg. In August, the Alexa Machine Learning team at Amazon made headway in bringing key voice recognition models offline. And at a September hardware event where it launched 11 new and refreshed Alexa-powered products, the Seattle company showed off Hunches, which proactively recommends actions based on data from connected devices and sensors, and whisper mode, which responds to whispered speech with a quieter tone. (Whisper mode launched this month in the U.S.)