Google open-sources ActiveQA, an AI agent that learns to ask good questions

Artificially intelligent (AI) systems aren't naturally good at asking questions; they have to be taught. It's a core area of focus for Google, which taps natural language processing and other conversational AI techniques to make interactions with the Google Assistant as natural as they can be.

Today, it's open-sourcing Active Question Answering (ActiveQA), a research project that investigates the use of reinforcement learning to train AI agents for question answering.

Michelle Chen Huebscher, a software engineer at Google AI, Google's eponymous AI division, describes ActiveQA as an "agent that repeatedly [...] interacts with QA systems using natural language with the goal of providing better answers." Like an incessant kid, it repeats questions ("When was Tesla born?") in new forms ("Which year was Tesla born?") and with novel phrasing ("When is Tesla's birthday?"), with the ultimate goal of obtaining better answers.

"[The] agent ... sits between the user and a black box QA system and learns to reformulate questions to elicit the best possible answers," Google researchers wrote in a paper ("Ask the Right Questions: Active Question Reformulation with Reinforcement Learning") published during the Sixth International Conference on Learning Representations in May. "The agent probes the system with, potentially many, natural language reformulations of an initial question and aggregates the returned evidence to yield the best answer."
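The probe-and-aggregate loop the researchers describe can be illustrated with a minimal Python sketch. It is not code from the ActiveQA package: the function names, the `(answer, score)` interface, and the canned answers are all assumptions made for illustration, and a hand-written list of rewrites stands in for the learned reformulation model.

```python
from typing import Callable, List, Tuple

# (answer text, confidence score); an assumed interface, not ActiveQA's own.
ScoredAnswer = Tuple[str, float]


def active_qa(
    question: str,
    reformulate: Callable[[str], List[str]],
    qa_system: Callable[[str], ScoredAnswer],
) -> str:
    """Probe a black-box QA system with rewrites and keep the best answer."""
    # Always include the user's original wording among the candidates.
    candidates = [question] + reformulate(question)
    # Query the black box once per candidate question.
    answers = [qa_system(q) for q in candidates]
    # Aggregate: here, simply keep the highest-scoring answer.
    best_text, _best_score = max(answers, key=lambda a: a[1])
    return best_text


def toy_reformulate(question: str) -> List[str]:
    # Hypothetical stand-in for the learned reformulator.
    return ["Which year was Tesla born?", "When is Tesla's birthday?"]


def toy_qa_system(question: str) -> ScoredAnswer:
    # Hypothetical black box with canned answers and made-up scores.
    canned = {
        "When was Tesla born?": ("unknown", 0.1),
        "Which year was Tesla born?": ("1856", 0.9),
        "When is Tesla's birthday?": ("July 10", 0.6),
    }
    return canned.get(question, ("unknown", 0.0))


print(active_qa("When was Tesla born?", toy_reformulate, toy_qa_system))  # prints 1856
```

In this toy version the QA system supplies its own score; in the system Google describes, a separate learned model selects among the returned answers.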

Over time, ActiveQA -- spurred on by a reinforcement learning framework -- learns to ask more pointed and specific questions that lead to the results it's seeking. Each reformulated question posed to the QA system is evaluated in terms of how good the answer it elicits is, and that feedback, whether good or bad, is used to adjust the model's parameters.
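To make the feedback loop concrete, here is a toy Python sketch of a REINFORCE-style policy gradient update, one common way to implement this kind of reinforcement learning. It is an illustration under stated assumptions, not the model shipped in the package: the agent chooses among a fixed handful of reformulations rather than generating text, and the reward values are invented.

```python
import math
import random
from typing import List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(rewards: List[float], steps: int = 5000,
                 lr: float = 0.1, seed: int = 0) -> List[float]:
    """Learn which reformulation to prefer from answer-quality rewards.

    rewards[i] is a made-up score for the answer that reformulation i
    elicits from the QA system. Returns the learned choice probabilities.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)  # the policy's parameters
    baseline = 0.0                 # running average reward, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a reformulation and observe the reward for its answer.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        advantage = rewards[action] - baseline
        # Gradient of log-probability of the action: one_hot(action) - probs.
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline += 0.05 * (rewards[action] - baseline)
    return softmax(logits)


# Hypothetical rewards for three phrasings of the same question.
probs = train_policy([0.1, 0.9, 0.6])
print([round(p, 3) for p in probs])
```

Phrasings whose answers score above the running average become more likely to be chosen, and those that score below it become less likely, which is the sense in which both good and bad responses shift the parameters.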

Google is making ActiveQA available in the form of a package for TensorFlow, its machine learning framework. In addition to an answer selection model -- a convolutional neural network trained using publicly available word embeddings from Stanford's GloVe dataset -- and a question-answering system based on Stanford's BiDAF (Bi-Directional Attention Flow for Machine Comprehension), the search giant is supplying a pretrained sequence-to-sequence system adapted from the TensorFlow Neural Machine Translation Tutorial Code.

In the aforementioned paper, the Google team demonstrated that ActiveQA could outperform the underlying QA system supplying the answers to its questions when evaluated on SearchQA, a dataset of questions extracted from Jeopardy!

"We envision that this research will help us design systems that provide better and more interpretable answers," Huebscher and Rodrigo Nogueira, a Ph.D. student and software engineering intern at Google AI, wrote in their blog post. "Google's mission is to organize the world's information and make it universally accessible and useful, and we believe that ActiveQA is an important step in realizing that mission."
