September 20, 2023

“Alexa, let’s chat.” Nearly a decade after debuting its voice-activated Echo devices, the e-commerce and cloud juggernaut Amazon today announced that its signature voice assistant Alexa is being upgraded with a new, custom-built large language model (LLM), taking advantage of the generative AI boom in Silicon Valley to give Alexa even more capabilities and human-like conversational qualities.

The news was delivered by David Limp, Amazon’s senior vice president (SVP) of devices and services, at the company’s lavish “HQ2” headquarters in Virginia, outside of Washington, D.C., with VentureBeat among those in attendance.

The new Alexa LLM will be available as a free preview on Alexa-powered devices in the U.S. soon, and Amazon claims it is “smarter, more conversational” and its voice is more “realistic” and “casual.”

According to another speaker at the event, Rohit Prasad, Amazon’s SVP and head scientist of artificial general intelligence, the news marks a “massive transformation of the assistant we love.”

Amazon hoping to leapfrog OpenAI’s ChatGPT with more ‘real world’ capabilities

While Amazon’s entry in the conversational LLM space comes almost a year after OpenAI shocked the world with the power of its ChatGPT application and became a household name overnight, the company claims that the new Amazon LLM was worth the wait.

Amazon says that, unlike ChatGPT, whose knowledge base stops in late 2021 or early 2022, the Alexa LLM offers “real-time info,” is “more conversational” and has “less latency” than previous versions of Alexa.

Amazon called out ChatGPT by name during the event, saying its Alexa LLM “goes beyond ChatGPT in the browser or mobile” by offering users “real-world applications,” such as conversing with them about recipes and travel ideas and writing poems for them.

“What makes our LLM special doesn’t just tell you things, it does things,” Prasad said.

As if to illustrate this idea, Limp also performed a live demo in front of the crowd of assembled press and Amazon employees, asking how his “favorite football” team was doing. Alexa remembered he was referring to Vanderbilt University, showing off its personalized features. Amazon said Alexa would also respond in a “joyful” voice if the user’s preferred team won.

Limp also asked Alexa to write a message to his friends to remind them to watch the upcoming Vanderbilt football game and send it to his phone, and the assistant performed the action within a few seconds.

Amazon showed a promotional video in which it suggested that the new Alexa LLM was “part of the family” for users.

Four key components and third-party apps

Prasad said that the new Alexa LLM was built around four key components: large language models, real-world devices and services, personal context and responsible AI.

Another presenter, Heather Zorn, Amazon’s vice president of Alexa connections and essentials, said that developers can integrate their own “custom, purpose-built” third-party LLMs into Alexa, and that some already have.

One developer that already has is the popular Character.AI startup, which lets users create and interact with different fictional characters and archetypes and offers 25 different personality types.

Another developer, Splash, offers users the ability to create and preview songs through the Alexa integration of its app.

Impressive tech under the hood

Prasad said Alexa’s text-to-speech engine is now “more context-aware of emotions and tone-of-voice and then expressing similar emotional variation in the output” to what the speaker’s tone-of-voice is.

It also includes a new automatic speech recognition system designed for conversations “by taking what is best in class and making it even better” and “uses a massive transformer model.”

For Amazon Echo Show devices with a built-in screen and video camera, Amazon users enrolled in visual ID simply need to look at their device to talk to it; they no longer have to say “Alexa” over and over. They can carry on a conversation with the assistant as they would when looking at another person.

This is thanks to “on-device visual processing and acoustic models working in concert, so it knows whether you are addressing Alexa or someone else in the room,” according to Prasad.
