Can’t get your Amazon Echo or Google Home smart speaker to understand you? It might be your accent. Two teams of researchers recruited by the Washington Post found that Amazon’s Alexa assistant and the Google Assistant exhibit a pattern of poor performance with certain dialects.
The teams tested thousands of voice commands dictated by more than 100 people across 20 cities, and the results were conclusive: Google’s and Amazon’s speakers were 30 percent less likely to understand non-American accents than those of native-born users, and the overall accuracy rate for Chinese, Indian, and Spanish accents was about 80 percent. (The researchers stuck to Alexa and the Google Assistant, opting not to test other voice assistants, like Apple’s Siri or Microsoft’s Cortana.)
This discrepancy in accuracy is clear evidence of bias in the data used to train the two voice recognition systems, Rachael Tatman, a Kaggle data scientist with expertise in speech recognition, told the Washington Post.
“These [voice assistants] are going to work best for white, highly educated, upper-middle-class Americans, probably from the West Coast, because that’s the group that’s had access to the technology from the very beginning,” she said.
There’s evidence to suggest this might be true.
People who spoke Spanish as a first language were misinterpreted 6 percent more often than people from along the West Coast, the results showed. Google Home speakers were 3 percent less likely to give accurate responses to people with Southern accents than those with Western accents. And Amazon’s Echo devices performed 2 percent worse with Midwest inflections.
Globalme, a Vancouver-based language localization firm that contributed to the studies, ran Alexa and the Google Assistant through a gauntlet of 70 preset commands, like “Add a new appointment” and “How close am I to the nearest Walmart?” It found that Amazon’s assistant tended to do better with Southern and Eastern accents, while Google’s had an easier time understanding people from the West and Midwest.
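For readers curious how a comparison like this can be tallied, here is a minimal Python sketch, not the researchers' actual method. It assumes each test is logged as a hypothetical (accent group, understood or not) pair, then computes per-group accuracy and the percentage-point gap against a chosen baseline group. The function names, group labels, and trial records are made up for illustration and are not the study's data.

```python
from collections import defaultdict


def accuracy_by_group(trials):
    """Return the fraction of understood commands per accent group.

    `trials` is an iterable of (accent_group, understood) pairs,
    a hypothetical logging format assumed for this sketch.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, understood in trials:
        totals[group] += 1
        if understood:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


def gap_vs_baseline(accuracy, baseline):
    """Percentage-point difference between each group and the baseline group."""
    base = accuracy[baseline]
    return {
        group: round((acc - base) * 100, 1)
        for group, acc in accuracy.items()
        if group != baseline
    }


# Made-up trial records for illustration only; not the study's data.
trials = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", True), ("group_b", False),
]

acc = accuracy_by_group(trials)
gaps = gap_vs_baseline(acc, "group_a")
print(acc)
print(gaps)
```

A negative gap means the group was understood less often than the baseline. With only a few dozen commands per speaker, gaps of a couple of percentage points can fall within sampling noise, so a real analysis would also want confidence intervals.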
An Amazon spokesperson told the Washington Post that Alexa’s voice recognition improves over time as more users speak to it with various accents. Google, in a statement, pledged to “continue to improve speech recognition for the Google Assistant as we expand our datasets.”
The findings aren’t exactly earth-shattering — linguistic differences in pronunciation have stumped algorithms for years. (A recent study found that YouTube’s automatic captioning did worse with Scottish speakers than American Southerners.) But the results put into sharp relief the challenges smart speaker OEMs, which have sold tens of millions of units collectively, have yet to overcome.
New studies may hold the key. Just this week, researchers at Cisco, the Moscow Institute of Physics and Technology, and the Higher School of Economics proposed a system that leveraged dialectal differences in diction and intonation to create new accented samples of words, which it learned to recognize accurately.