In what may be the biggest rollout of conversational AI from IBM since Watson, IBM Research today debuted Project Debater, an experimental conversational AI with a sense of humor, little tact, and occasionally powerful arguments.
Training for Project Debater began six years ago, but the AI system only gained the ability to participate in debates with people two years ago, said Noam Slonim, IBM Research principal investigator and creator of Project Debater. The AI’s smarts come from hundreds of millions of interactions with millions of journal and newspaper articles.
Project Debater’s ability to deliver persuasive arguments was demonstrated for an audience of tech journalists gathered at IBM offices in San Francisco, where the AI participated in debates about whether governments should subsidize space exploration and whether telemedicine should play a bigger role in health care.
When Project Debater gets a new topic, it searches its corpus of articles for sentences and clauses that are relevant, argumentative, and support its side of the debate. Parsing that content, it attempts to understand the main themes or clashes in a debate and then organize the narrative that it will deliver.
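As a rough illustration of the selection step described above, here is a minimal sketch in Python. It is not IBM's implementation: the cue-word lists, the scoring rule, and every name in it (`Candidate`, `score`, `select_arguments`) are assumptions made up for this example. It only shows the general idea of keeping sentences that are on topic and lean toward one side.

```python
from dataclasses import dataclass

# Hypothetical cue words standing in for a real stance classifier.
PRO_CUES = {"benefit", "improve", "create", "save"}
CON_CUES = {"waste", "harm", "cost", "risk"}


@dataclass
class Candidate:
    text: str
    relevance: float  # share of words that are topic terms
    stance: int       # +1 leans pro, -1 leans con, 0 neutral


def tokenize(sentence):
    """Lowercase the sentence and strip surrounding punctuation from each word."""
    return [w.strip(".,;:!?\"'").lower() for w in sentence.split()]


def score(sentence, topic_terms):
    """Assign a toy relevance score and stance to one sentence."""
    words = tokenize(sentence)
    if not words:
        return Candidate(sentence, 0.0, 0)
    hits = sum(1 for w in words if w in topic_terms)
    pro = sum(1 for w in words if w in PRO_CUES)
    con = sum(1 for w in words if w in CON_CUES)
    stance = (pro > con) - (pro < con)
    return Candidate(sentence, hits / len(words), stance)


def select_arguments(corpus, topic_terms, side, top_k=3):
    """Keep on-topic sentences that match the requested side, most relevant first."""
    scored = [score(s, topic_terms) for s in corpus]
    kept = [c for c in scored if c.relevance > 0 and c.stance == side]
    kept.sort(key=lambda c: c.relevance, reverse=True)
    return [c.text for c in kept[:top_k]]


corpus = [
    "Space exploration can create jobs.",
    "Space exploration is a waste of money.",
    "Cats sleep a lot.",
]
pro_side = select_arguments(corpus, {"space", "exploration"}, side=+1)
con_side = select_arguments(corpus, {"space", "exploration"}, side=-1)
```

A real system would presumably replace the keyword counts with trained models for relevance and stance, and would add the later step of grouping the selected sentences into themes, which this sketch leaves out.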
The debates were unscripted, a company spokesperson said, with the exception of jokes during the debates and greetings delivered at the start of a debate. Project Debater competed against Dan Zafrir, president of the International Debate Society in Israel, and 2016 national Israeli debate champion Noa Ovadia.
The debates followed the structure of a four-minute opening speech, a four-minute rebuttal, and a two-minute argument summary. Between each portion, Project Debater paused to process the words of its opponent.
Compared with chess, where IBM’s Deep Blue played Garry Kasparov, or Go, where DeepMind’s AlphaGo beat the top player in the world, debate is a more nuanced form of competition. But voting by attendees at the event mirrored results found in the lab, which determined that humans often give better speeches, while the AI generally outperforms humans at enriching the audience’s knowledge, said Project Debater manager Ranit Aharonov.
Debater was able to win over nine members of the audience of about 40 with its argument in favor of telemedicine, effectively beating Dan Zafrir.
In its arguments, Project Debater was able to cite a range of sources, from a sheikh in the United Arab Emirates to a German minister of economic affairs on the number of jobs space exploration creates. But Project Debater avoided directly quoting its opponents, since an error in its speech-to-text transcription could result in a misquote.
In spite of those precautions, the AI system did get some things wrong and made some wild assertions, like when it argued that space exploration was “more important than good roads, or improved schools, or better health care” or when it randomly said “Scott Pelley voiceover” in the middle of an argument, seemingly referencing the CBS News and 60 Minutes correspondent.
Chris Reed is director of the Center for Argument Technology, an academic group that explores conversational AI and is not a part of the project. At the invitation of IBM, Reed saw Project Debater perform for the first time Monday and said it was like watching “so many pieces of the puzzle coming together,” citing the AI’s ability to stick to its own argument, its lack of grammatical errors, and its ability to anticipate and rebut opponents’ arguments before they made them.
“Argument and debate — essentially, it’s the engine that drives the process of science, characterizes what happens in most political forums, and even frames most conceptions of modern religion,” he said. “Argumentation is one of the defining features of what it is to be human, and if we can convey part of that then I think that means something very important is starting to change.”
A number of innovative techniques documented in dozens of research papers were developed to make Project Debater possible, Slonim said. Though it may take a fair amount of technology to prepare arguments and rebuttals or understand an opponent’s argument, much of Debater’s dialogue may still be drawn directly from articles.
“A lot of content that you see is actually phrases that are taken from the sources, like newspapers. They do undergo rephrasing of various sorts to make them more coherent, to make them align with each other, to sometimes add information about the person mentioned, or so on, so there is phrasing, but a lot of it is taken as is,” Aharonov said.
Following today’s performance, Project Debater will participate in an extended debate later this year, possibly followed by a workshop in which academics can critique Debater’s performance, Slonim told VentureBeat in an interview.