In what may be the biggest rollout of conversational AI from IBM since Watson, IBM Research today debuted Project Debater, an experimental conversational AI with a sense of humor, little tact, and occasionally powerful arguments.

Training of Project Debater began six years ago, and the system only gained the ability to participate in debates with people two years ago, said Noam Slonim, IBM Research principal investigator and creator of Project Debater. Debater’s smarts come from access to millions of journal and newspaper articles.

The AI system’s ability to deliver persuasive arguments was demonstrated for an audience of tech journalists at IBM offices in San Francisco, where it participated in debates about whether governments should subsidize space exploration and whether telemedicine should play a bigger role in health care.

When Project Debater gets a new topic, it searches its corpus of articles for sentences and clauses that are relevant to the topic, argumentative, and supportive of its side of the debate. Then, given all that content, it attempts to identify the main themes or clashes in the debate and organizes them into a narrative that it delivers.
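The three filters described above (relevant to the topic, argumentative, supportive of one side) can be illustrated with a toy sketch. This is not IBM's implementation: the cue-word lists, the `Sentence` type, and the `select_claims` function are all hypothetical, and a real system would presumably rely on trained models rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical cue words, chosen only for illustration.
ARGUMENT_CUES = {"because", "should", "therefore", "improves", "harms"}
PRO_CUES = {"improves", "benefits", "creates", "saves"}
CON_CUES = {"harms", "wastes", "costs", "risks"}


@dataclass
class Sentence:
    text: str
    source: str


def tokens(text):
    """Lowercase the words of a sentence and strip surrounding punctuation."""
    return {word.strip(".,;:!?\"'").lower() for word in text.split()}


def select_claims(corpus, topic_terms, side):
    """Keep sentences that are on-topic, argumentative, and match the given side."""
    side_cues = PRO_CUES if side == "pro" else CON_CUES
    selected = []
    for sentence in corpus:
        words = tokens(sentence.text)
        relevant = bool(words & topic_terms)
        argumentative = bool(words & ARGUMENT_CUES)
        supportive = bool(words & side_cues)
        if relevant and argumentative and supportive:
            selected.append(sentence)
    return selected


# Made-up corpus entries, not taken from any real article.
corpus = [
    Sentence("Telemedicine improves access to care because patients avoid travel.", "journal A"),
    Sentence("Telemedicine wastes money and should be cut.", "newspaper B"),
    Sentence("The clinic opened in 1998.", "newspaper C"),
]

pro_claims = select_claims(corpus, {"telemedicine"}, "pro")
con_claims = select_claims(corpus, {"telemedicine"}, "con")
```

In this sketch, the sentence from "journal A" is selected for the pro side and the one from "newspaper B" for the con side, while the purely factual sentence is dropped because it is neither on-topic nor argumentative. Organizing the selected claims into a narrative is a separate step that this sketch does not attempt.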

The debates were unscripted, a company spokesperson said, with the exception of jokes during debates and greetings delivered at the start of a debate. Project Debater competed against Dan Zafrir, president of the International Debate Society in Israel, and 2016 national Israel debate champion Noa Ovadia.

The debates followed a structure of a four-minute opening speech, a four-minute rebuttal, and a two-minute argument summary. Between portions, Project Debater paused to process the words of its opponent.

Unlike chess, where IBM’s Deep Blue played Garry Kasparov, or Go, where DeepMind’s AlphaGo beat the top player in the world, debate is more nuanced than other competitions. Still, attendees’ votes on debate performance mirrored results found in the lab: humans often give a better speech, while the AI generally outperforms humans at enriching the audience’s knowledge, said Project Debater manager Ranit Aharonov.

Debater, however, was able to win over nine members of the audience of about 40 with its argument in favor of telemedicine, effectively beating Dan Zafrir.

In its arguments, Project Debater quoted a range of sources, from a sheikh in the United Arab Emirates to a German minister of economic affairs on the number of jobs space exploration creates. But Project Debater avoided directly quoting its opponents, so that neither its own mistakes nor errors in its speech-to-text transcription would result in a misquote.

The AI system did get some things wrong and made some wild assertions, like when it argued that space exploration was “more important than good roads, or improved schools, or better health care” or randomly said “Scott Pelley voiceover” in the middle of an argument, seemingly referencing the CBS News and 60 Minutes correspondent.

Chris Reed is director of the Center for Argument Technology, an academic group that explores conversational AI and is not a part of the project. At the invitation of IBM, Reed saw Project Debater perform for the first time Monday and said it was like watching “so many pieces of the puzzle coming together,” including an ability to stick to its own argument, a lack of grammatical errors, and an ability to anticipate and rebut its opponent’s arguments before they are made.

“Argument and debate, essentially it’s the engine that drives the process of science, characterizes what happens in most political forum and even frames most conceptions of modern religion,” he said. “Argumentation is one of the defining features of what it is to be human and if we can convey part of that then I think that means something very important is starting to change.”

A number of innovative techniques documented in dozens of research papers were developed to make Project Debater possible, Slonim said. Though it may take a fair amount of technology to prepare arguments and rebuttals or understand an opponent’s argument, much of Debater’s dialogue may still be drawn directly from articles.

“A lot of content that you see is actually phrases that are taken from the sources like newspapers. They do undergo rephrasing of various sorts to make them more coherent, to make them align with each other, to sometimes add information about the person mentioned or so on, so there is phrasing, but a lot of it is taken as-is,” Aharonov said.

Following today’s performance, Project Debater will participate in an extended debate later this year, possibly followed by a workshop where academics can offer their critiques of Debater’s performance, Slonim told VentureBeat in an interview.

More to come.