Ever heard of context modeling? It defines how contextual data is structured and maintained, and it plays a pivotal role in open domain conversation. That’s why researchers at Microsoft recently investigated a novel approach that involves rewriting the last utterance in a dialogue turn (i.e., a series of utterances) by considering context history. In a preprint paper detailing their work (“Unsupervised Context Rewriting for Open Domain Conversation”), they claim empirical results show it outperforms state-of-the-art baselines in terms of rewriting quality and multi-turn response generation.
As the researchers explain, conversation context raises challenges that sentence-level modeling does not face, including topic transitions, coreferences (e.g., he, him, she, it, they), and long-term dependencies. Most systems tackle these by appending keywords to the last-turn utterance or by learning a numerical representation with AI models, but they often run into roadblocks, like an inability to select the right keyword or handle long context.
That’s where the team’s method comes in. It reformulates the last utterance in a dialogue by considering contextual information, with the goal of generating a self-contained utterance that neither has coreferences nor depends on other utterances in history. For instance, the exchange “I hate drinking coffee. Why? It’s tasty” becomes “Why hate drinking coffee? It’s tasty,” drawing on the word “it” (which refers to the coffee in context) and “Why?” (a shorter form of “Why hate drinking coffee?”).
The researchers designed a machine learning system, the context rewriting network (CRN), to automate the process end-to-end. It consists of a sequence-to-sequence model that maps fixed-length utterances to fixed-length rewritten sentences, with a separate attention mechanism that helps directly copy words from the context by focusing on different words in the last utterance.
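To make the copying idea concrete, here is a minimal sketch of how a decoder step might blend "generate a word from the vocabulary" with "copy a word from the context" using attention weights. It follows the generic pointer-style mixing used in many copy-augmented sequence-to-sequence models and is not taken from the paper; the function names, the `p_gen` mixing weight, and the toy numbers are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_augmented_distribution(vocab_probs, context_tokens, attention_scores, p_gen):
    """Blend a generation distribution with a copy distribution.

    vocab_probs      -- dict mapping token -> generation probability (sums to 1)
    context_tokens   -- tokens from the conversation context
    attention_scores -- one raw score per context token
    p_gen            -- hypothetical weight in [0, 1] given to generating vs. copying
    """
    attention = softmax(attention_scores)
    # Scale the generation probabilities by p_gen.
    mixed = {tok: p_gen * p for tok, p in vocab_probs.items()}
    # Add the copy probability mass onto the attended context tokens.
    for tok, weight in zip(context_tokens, attention):
        mixed[tok] = mixed.get(tok, 0.0) + (1.0 - p_gen) * weight
    return mixed


if __name__ == "__main__":
    dist = copy_augmented_distribution(
        vocab_probs={"why": 0.5, "tasty": 0.5},
        context_tokens=["hate", "drinking", "coffee"],
        attention_scores=[0.0, 0.0, 2.0],
        p_gen=0.5,
    )
    print(max(dist, key=dist.get))
```

With these toy numbers the attention concentrates on "coffee," so the copied context word ends up more probable than any word the decoder would have generated on its own.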
The team trained the model first using pseudo data generated by inserting extracted keywords from context into the original last utterance. Then, to let the final response influence the rewriting process, they tapped reinforcement learning, an AI training technique that employs rewards to drive systems toward goals.
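The pseudo-data step can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline: the stopword list, the keyword-selection heuristic, and the fixed insertion `position` are all assumptions made here, and the paper's actual criteria for choosing keywords and insertion points are not described in this article.

```python
# Hypothetical stopword list, used only for this illustration.
STOPWORDS = {"i", "it", "it's", "why", "the", "a", "is", "to", "and", "you", "do"}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,?!").lower() for t in text.split() if t.strip(".,?!")]


def extract_keywords(context, last_utterance, max_keywords=2):
    """Pick context words that are not stopwords and not already in the last utterance."""
    seen = set(tokenize(last_utterance))
    keywords = []
    for utterance in context:
        for tok in tokenize(utterance):
            if tok not in STOPWORDS and tok not in seen:
                seen.add(tok)
                keywords.append(tok)
    return keywords[:max_keywords]


def make_pseudo_pair(context, last_utterance, position=0, max_keywords=2):
    """Build one (original, pseudo-rewritten) training pair by inserting keywords."""
    keywords = extract_keywords(context, last_utterance, max_keywords)
    tokens = last_utterance.split()
    rewritten = tokens[:position] + keywords + tokens[position:]
    return last_utterance, " ".join(rewritten)


if __name__ == "__main__":
    source, target = make_pseudo_pair(
        ["I hate drinking coffee."], "Why? It's tasty", position=1, max_keywords=3
    )
    print(source)
    print(target)
```

Pairs built this way are noisy and not always fluent, which fits the article's point that they serve only as a starting signal before reinforcement learning refines the rewriter.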
In a series of experiments, the team evaluated their method on several tasks spanning rewriting quality, multi-turn response generation, multi-turn response selection, and end-to-end retrieval-based chatbots. They note that their model occasionally became unstable after reinforcement learning because it preferred to extract more words from context, but that it generally “significantly” improved the diversity of utterances.
The team believes their work takes a step toward more explainable and controllable context modeling, chiefly because the explicit context rewriting results are easier to debug and analyze. “The rewriting contexts are similar to human references,” they wrote. “Our model can extract important keywords from noisy context and insert them into the last utterance, [making it] not only easy to control and explain … but also [useful for] transmit[ting] … information directly to [the] last utterance.”