Amazon taps its SocialBot challenge to boost conversational AI

Earlier this week, Amazon announced the winners of its annual Alexa Prize SocialBot Grand Challenge, which promotes research into coherence, context awareness, fluency of response, and other areas fundamental to the future of conversational AI. Participating university teams design social bots for Alexa-enabled devices and can validate their ideas by directly engaging with Amazon's millions of Alexa customers.

But the competition isn't just a way for participants to experiment and earn research grants. Each research team maintains ownership of the intellectual property in its systems, and a win might mean an opportunity to integrate its research into Amazon's future plans. The event typically produces fairly significant advances in conversational AI, along with subsequent scientific papers.

This year's progress came largely in conversation duration. All three finalists achieved 20-minute conversations, a mark that had been reached only once during the first three editions of the challenge. Additionally, the average conversation time across social bots in the final rounds was 12 minutes and 42 seconds, more than double the average from 2020's finals. One team also showed real progress in conversation quality, earning a perfect 5/5 rating.

"The average duration provides a consolidated indicator of how the scientific advances made by the Alexa Prize teams have driven great strides in the coherence and fluency of interactions," Prem Natarajan, VP of natural understanding for Alexa AI, told VentureBeat. He added that participants overall "have broadly influenced the field of conversational AI through the advances they've achieved," mentioning progress in neural response generation models, common sense knowledge representations, dialogue modeling/policies, and natural language understanding systems enhanced by large-scale Transformer models.

The winners

First place this year went to the team from Czech Technical University. The students' Alquist bot had an average score of 3.28 and average conversation duration of 14 minutes and 14 seconds, earning them $500,000. The team has made it to the finals in the past, but this marked its first win.

Jakub Konrád, who led the team, told VentureBeat the current version of Alquist leverages a combination of a flexible generative approach and high-quality dialogue scenarios. "This approach allows the bot to deliver an engaging conversation and act coherently to the unexpected inputs," he said. "Moreover, the generative model considers the external facts about the discussed entities to make the conversation more natural." Another important aspect, he noted, is that the bot remembers essential information about the user and then considers such information when constructing the subsequent conversation.

"With conversational AI making great strides in what is possible, we're reaching the point of having a free-flowing open-domain natural conversation …" Konrád said. "What we find most exciting is that that means we can continue to advance this technology and utilize it in real-world applications that are able to care for people and help them with their issues. Conversational AI has many usages, for example, in well-being, mental health, or education."

The Stanford University team took second place this year, earning $100,000 for its Chirpy Cardinal bot. Third place went to the team from the University at Buffalo, which earned a $50,000 prize for its PROTO bot.

The future of conversational AI

Asked what the advances seen in these challenges say about the progress of conversational AI overall, Natarajan said Transformer-based deep learning technologies are "clearly making a difference in addressing hard conversational AI problems."

"Recent work in their potential for supporting compositionality indicates that we haven't tapped their full potential yet," he added. "Overall, we're pleased to see the university teams drive improvements in open-domain, multi-turn conversations. Their work provides another reason for optimism."

Natarajan believes the latest conversational technologies the company is starting to introduce, as well as those in development, will bring a new level of generalizability and autonomy to Alexa and the field of AI for both enterprises and consumers. "What this means is AI systems will become more self-aware, more self-learning, and enable more self-service by both developers and end users," he said.

Alexa has come a long way since it was first introduced with the Amazon Echo in 2014. The company has introduced new features like Alexa Answers and has seen growing adoption among enterprises. In a recent survey of 500 IT and business decision-makers, 28% of respondents said they were using voice technologies and 84% said they expected to be using them in the next year. But voice technology hasn't been without issues. VentureBeat recently discussed persisting problems around toxicity, privacy, and bias with Rohit Prasad, head scientist at Amazon's Alexa division, who said the team is hoping to inject greater trust into its systems.