Google Duplex, Google’s AI chat agent that can arrange appointments over the phone, will soon expand to more places — namely the web. Today at I/O 2019, Google announced Duplex on the web, which will handle things like rental car bookings and movie tickets.
“We want to build a more helpful Google for everyone,” said CEO Sundar Pichai onstage in Mountain View, California. “We’re going to be thoughtful [about this].”
When Duplex on the web debuts, you’ll be able to issue Google Assistant a command like “Book me a car from Hertz.” Assistant will then navigate to the relevant web page and automatically fill in details like your name, car preferences, trip dates, payment information (using information from Gmail and Chrome autofill), and more.
Throughout the process, you’ll see a progress bar. And whenever Duplex needs more information, like a price or seat selection, it’ll pause and prompt you to choose. Once you’re finished, a tap of the confirmation button will beam a receipt to your inbox.
Duplex on the web will launch later this year on Android phones.
Duplex: A brief history
Duplex over the phone, which Google first demoed at its I/O 2018 developer conference in May, reached a “small group” of Pixel users in select cities last November before debuting on the Pixel 2, Pixel 3, and flagship smartphones like the Samsung Galaxy S10 earlier this year.
Booking appointments with a robot would be fraught with peril, you’d think, considering the number of things that might go wrong. What if the human on the other end has a thick accent that throws off Duplex’s speech recognition? What if they ask an obscure question the system hasn’t been trained to handle? Or what if the restaurant’s listed number is no longer in service?
Part of the reason Duplex sounds so natural is that it taps Google’s sophisticated WaveNet audio processing neural network and intelligently inserts “speech disfluencies,” the “ums” and “ahs” people make involuntarily in the course of conversation. The concept comes from pragmatics, a branch of linguistics that deals with language in use and the contexts in which it is used, including such things as taking turns in conversation, text organization, and presupposition.
Scott Huffman, vice president of engineering for the Google Assistant, revealed in interviews last summer that the disfluencies turned out to be the key to advancing talks in tests of Duplex. Without them, he said, people were more likely to hang up as the exchange started to feel overly artificial.
Google received a ton of criticism after its initial Duplex demo in May 2018, as many were not amused that Google Assistant mimicked a human so well. That June, the company promised that Google Assistant using Duplex would first introduce itself.
Of course, business owners don’t have to speak with the Google Assistant if they decide they’d rather not. At the beginning of exchanges, Duplex makes clear that the call is automated. In all states, it informs the person on the other end that they’re being recorded. If they respond with “I don’t want to be recorded” or some variation of the phrase, the call is handed off to a human operator on an unrecorded line. (Those operators also annotate the call transcripts used to train Duplex’s algorithms.)
When we tested Duplex late last year, Google said that although a majority of Duplex calls are made using the automated system, others are conducted with human operators. It declined to say whether that would be the case going forward.