One year ago, when Google’s Duplex emerged as a conversational AI service that makes phone calls on your behalf, much was made of its ability to talk like a human, an act too deep in the uncanny valley for some. But that’s a shortsighted assessment.
The emergence of Duplex was the opening salvo in Google’s broadening voice strategy, which goes beyond powering a world-class AI assistant to offer consumers the ability to interact with businesses via an automated bot, while simultaneously offering businesses conversational services to help them interact with customers.
Since the emergence of Duplex, Google Cloud launched Contact Center AI to manage customer service centers and to augment conversations with customers. The voice strategy continued with aggressive international expansion, starting with the ability to read virtually any text on Android Go, a lighter version of the Android operating system. Voice is also being used to fill in text fields for people with classic T9 phones. Google Assistant also picked up new languages, and is now available in more than 30 languages and 80 countries around the world.
Growth is high in some parts of the world outside the United States. Earlier this year at Mobile World Congress, Google said Google Assistant use has seen a 7x increase in active users in places like India, Indonesia, Mexico, and Brazil this past year.
Conversations for people and businesses
To help people get info about things like movies and restaurants, in February Google introduced Google Assistant Suggestions for Android Messages, a service that will recommend actions based on the words used in text conversations. Last month, Google Voice was made available for all G Suite customers for phone calls; it can create customer greetings, from simple greetings to branching choices for callers.
Then last month, Google introduced Calljoy to help small businesses harness language models to automate management of incoming calls. Calljoy is being labeled an experiment from Google’s Area 120 at launch, but Duplex was called an experiment at launch too.
Each of these offerings improves upon existing features like the Actions on Google platform for conversational actions and Dialogflow, Google’s NLP engine for conversational bots. At I/O this year, the company accelerated its voice strategy with on-device machine learning to make Google Assistant up to 10 times faster and to power initiatives like Duplex for the web.
The latest iteration of Duplex fills in information about you to make completing purchases quicker and easier to do. Driving Mode with Google Assistant in the car was also introduced, following the addition of Google Assistant to Google Maps earlier this year and to Android Auto last year.
Google also showcased efforts to reach new frontiers in conversational AI and grow speech recognition for people with disabilities, including Live Relay, which uses AI to help people who are deaf or hard of hearing carry out voice phone calls.
The Alphabet subsidiary also showed off new ways to use Google Assistant’s computer vision tool Lens, including live translation of text in more than 100 languages. There’s also the new Nest Hub Max, a device akin to a smart speaker with a built-in screen that marries voice with visual experiences. The Nest Hub Max uses facial recognition software to personalize what appears on the screen. A combination of voice authentication and facial recognition biometrics could someday form the base for seamless, friction-free payments.
Each of these features, products, and services rolled out in the past year funnels into a single strategy: connecting businesses and customers.
Voice chat focuses on use cases
This week, the chosen use case for Duplex on the web is Hertz rental cars, and flight check-in via Google Assistant will begin with United Airlines. Duplex isn’t a general-purpose tool, however; it’s only for calling businesses such as restaurants and hair salons.
This is a classic conversational commerce use case, one Facebook has stuck to since its Messenger Platform emerged in 2016, including for WhatsApp Business. But Google’s approach appears more versatile, capable of both voice and text interaction.
Here we see a company propelling its voice-first strategy by being both comprehensive and ubiquitous. Ubiquity is a word regularly used by the makers of Alexa, Siri, Bixby, and other assistants to describe an ambition to be available in the home, car, and workplace, but it’s most true of Google Assistant, which is now available on more than a billion devices.
All of this adds up to one fact: By some measures, Google may have already won the chat wars, the competition between tech giants to convince the world to adopt their AI assistants on devices like smart speakers, televisions, smartphones, and cars.
In smart speakers, Amazon continues to lead global sales with 13.7 million speakers sold in Q4 2018, according to a Strategy Analytics report released in February. However, Google isn’t far behind, with 11.5 million units shipped, and in the past year Amazon’s global market share has slid from near 80% to about even with Google at 30%.
Canalys and Strategy Analytics even found that in Q2 2018 Home Mini speakers outsold all other smart speakers worldwide.
Despite the growing popularity of smart speakers and smart displays, Google’s dominance is based on the fact that the smartphone is still the way people are most likely to interact with an intelligent assistant.
This is reflected in a Microsoft survey released last month that found Siri and Google Assistant are the most popular AI assistants: 36% of respondents said they have used Siri and 36% said they have used Google Assistant, followed by Alexa (25%) and Cortana (19%).
Android is now on 2.5 billion monthly active devices, making it far and away the most popular mobile operating system on Earth. Chrome, which also supports Google voice control, is the world’s most popular web browser, accounting for the majority of global web traffic. Then there’s Google’s search monopoly.
But that’s consumer adoption. The past year marks a shift toward bringing on more small business and enterprise customers, and one of the biggest competitors standing in the way of Google’s attempt to convince enterprise customers to adopt its solutions is Microsoft, with Cortana.
Don’t sleep on Microsoft
Despite Microsoft’s efforts, Cortana has yet to be incorporated into a range of appliances the way Amazon’s Alexa and Google Assistant have, and it has failed to attract major growth in consumer usage or manufacturer adoption.
This week at the Build conference in Seattle, Microsoft CEO Satya Nadella emphasized the need for a multi-assistant world. That’s a concept routinely mentioned by Amazon, Facebook, and Microsoft, but never Google or Apple.
Microsoft also showcased advances in its conversational AI — which achieved human parity years ago. Microsoft Bot Framework and assistants like Cortana are going to be able to handle more multi-turn dialogue thanks to Semantic Machines, a startup the company acquired last year.
Microsoft also urged enterprise customers to create their own Cortana-like assistants to make their employees more efficient.
The distinction between Microsoft and Google in this scenario is that Microsoft is pitching to existing customers, which include the majority of the Fortune 500. Google, on the other hand, is a company with undeniable advantages and a vision of using voice, the simplest user interface, to advance its business interests with enterprise customers. This effort is helped along by virtual monopolies on three tools fundamental to modern life: the search engine, the smartphone, and the web browser.
Facebook still owns some of the most-used chat apps on Earth, Microsoft still has relationships with many of the world’s enterprise customers, and Amazon’s Alexa still has the most popular smart speaker in the United States. But Google’s voice strategy remains more comprehensive, understandable, and compelling than the rest.