If you’ve recently had your first interaction with a voice-based personal assistant like Amazon’s Alexa or Apple’s Siri, you might get the sense that artificial intelligence is just a few years away from being able to talk and act like a human, and that it will soon be capable of managing our schedules, troubleshooting technical issues, or even holding a conversation.
According to a recent Wall Street Journal piece titled “Alexa and Cortana May Be Heading to the Office,” many businesses share that hope. One startup profiled in the piece uses “an Amazon Echo attached to the office ceiling for such tasks as adding events to their calendars,” while another is building a virtual assistant to set meetings on behalf of human users.
The belief that natural language processing is right around the corner seems to be widespread: About half of IT professionals in the Spiceworks survey cited in the article said they plan to use intelligent assistants in a corporate setting in the next three years.
This attitude reflects an admirable openness to new technology, but those companies are almost certainly going to be disappointed. The truth is, today’s voice-based personal assistants aren’t actually that intelligent. They’re basically rendering language into search keywords the same way Ask Jeeves did in the 1990s, albeit with low latency (no unnatural pauses) and a hands-free form factor. Speech recognition is becoming more accurate, but there’s a big difference between speech recognition and speech comprehension.
Human communication is an incredibly difficult skill to automate. Language comprehension relies on global context, social precedent, and the ability to resolve ambiguity. The researchers attempting to teach language to machine learning algorithms are focusing their efforts on more achievable domains, like speech-to-text transcription (still only about 60 percent accurate) and optical character recognition. True natural language understanding (NLU) is still decades away. For now, Alexa and friends offer glorified versions of the same interactive voice response (IVR) technology we’ve been shouting at in phone menus for years.
As the owner of an Amazon Echo, I can attest that these digital assistants are great consumer products. I worry, however, that we risk overestimating the underlying technology. Pinning our expectations for AI on the capacity for NLU and humanlike interaction sets us up for a long, demoralizing wait, during which time we may miss more promising opportunities to leverage machine learning.
Long before companies can rely on NLU, machine learning will play an enormous role in helping us eliminate the robotic tasks that still define many of our jobs. That’s where forward-looking companies should be focused in the near future: deepening the collaboration between human and machine intelligence, not attempting to model the complexities of language.
These aren’t the droids you’re looking for
Since the dawn of automation, our culture has never been able to shake the notion that robots ought to think like we do. In the popular imagination, machine intelligence will only have arrived when it can replicate a specifically human kind of processing.
But human intelligence is powerful, dynamic, and poorly understood. For centuries we’ve assessed intelligence by someone’s ability to recall facts, or win at chess, or solve a Rubik’s Cube. As it turns out, those activities simply require algorithmic processing within scoped domains, which is precisely the type of cognition that computers are good at. Those were the areas in which computers exceeded our capabilities first.
The skills gap is just as wide in domains where humans have the advantage. The ability to interpret ambiguous language comes easily to the human brain. A 5-year-old has no problem inferring that the phrase “I saw Mount Rainier flying to Seattle” doesn’t mean that the mountain is zooming towards the city. Alexa, for all its ability to look up facts on command, still falls prey to sentences like that.
Even if it could achieve that basic level of language comprehension, there are many other layers of meaning and context the technology will miss. In 1973, Princeton anthropologist Clifford Geertz published a paper that described the ineffable way in which a wink, which is a simple movement of the eyelid, can communicate so many possible meanings. “A speck of behavior, a fleck of culture, and — voilà! — a gesture,” he wrote. Humans interpret these cues in ways that still confound machines.
Developing software that can speak naturally with humans isn’t just quixotic; it’s also redundant. Rather than fixate on imitating ourselves, we’re better off finding ways to let automation cover our weaknesses. Machine learning is ideally suited to data-heavy, repetitive tasks in a defined environment: in other words, the robotic work that businesses need done and that you don’t particularly enjoy. That’s where machine learning will revolutionize the office.
AI should empower, not imitate
Whether you’re prospecting sales leads, compiling a report, or handling customer service calls, every job function has some area machine learning can make more efficient. But because the computer can’t communicate with other people nearly as effectively as you can, it won’t actually call those leads. Nor will it deliver your report to a room of investors, or be empathetic on the phone with a frustrated customer. All of that is still human work. Your work.
This is a good thing. As a consumer, I would rather speak with a customer support person who can understand the nuances of my problem than with a well-designed chatbot. Especially if the customer service person is being assisted by AI in the background. AI systems can even detect problems as they develop, prompting the rep to call me before I know there’s an issue.
Augmenting human work in this way is how AI will benefit companies in the near term. Natural language understanding will continue to be an exciting area of focus in research labs and in PhD dissertations, but it’ll likely be the last thing we automate in a corporate setting. For the time being, there are other efficiencies to be found. Automation allows human knowledge workers to focus on the areas of cognition where we already perform best and, for the foreseeable future, will perform alone.
Ted Power is the cofounder of Abacus, a real-time expense system.