Yoshua Bengio: Attention is a core ingredient of 'conscious' AI

During the International Conference on Learning Representations (ICLR) 2020, which took place virtually this week as a result of the pandemic, Turing Award winner and director of the Montreal Institute for Learning Algorithms Yoshua Bengio provided a glimpse into the future of AI and machine learning techniques. He spoke in February at the AAAI Conference on Artificial Intelligence 2020 in New York alongside fellow Turing Award recipients Geoffrey Hinton and Yann LeCun. But in a lecture published Monday, Bengio expounded upon some of his earlier themes.

One of those was attention -- in this context, the mechanism by which a person (or algorithm) focuses on a single element or a few elements at a time. It's central both to machine learning model architectures like Google's Transformer and to the bottleneck theory of consciousness in neuroscience, which suggests that people have limited attention resources, so information is distilled down in the brain to only its salient bits. Models with attention have already achieved state-of-the-art results in domains like natural language processing, and they could form the foundation of enterprise AI that assists employees in a range of cognitively demanding tasks.
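
To make the idea of focusing on a few elements concrete, here is a minimal sketch of scaled dot-product attention, the form of attention used in Transformer-style models, written in plain Python. The function names (`softmax`, `attend`) and the toy vectors are illustrative assumptions, not anything shown in Bengio's lecture.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Score each element by how well its key matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is a weighted average of the values.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data (hypothetical): three elements, one of which matches the query.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [4.0, 0.0]

weights, output = attend(query, keys, values)
print(weights)  # most of the weight lands on the first element
print(output)
```

Because the weights sum to 1, giving more attention to one element necessarily leaves less for the others, which is a loose software analogue of the limited attention resources described above.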

Bengio described the cognitive systems proposed by Israeli-American psychologist and economist Daniel Kahneman in his seminal book Thinking, Fast and Slow. The first type is unconscious -- it's intuitive and fast, non-linguistic and habitual, and it deals only with implicit types of knowledge. The second is conscious -- it's linguistic and algorithmic, and it incorporates reasoning and planning, as well as explicit forms of knowledge. An interesting property of the conscious system is that it allows the manipulation of semantic concepts that can be recombined in novel situations, which Bengio noted is a desirable property in AI and machine learning algorithms.

Current machine learning approaches have yet to move beyond the unconscious to the fully conscious, but Bengio believes this transition is well within the realm of possibility. He pointed out that neuroscience research has revealed that the semantic variables involved in conscious thought are often causal -- they involve things like intentions or controllable objects. It's also now understood that a mapping between semantic variables and thoughts exists -- like the relationship between words and sentences, for example -- and that concepts can be recombined to form new and unfamiliar concepts.

Attention is one of the core ingredients in this process, Bengio explained.

Building on this, in a recent paper he and colleagues proposed recurrent independent mechanisms (RIMs), a new model architecture in which multiple groups of cells operate independently, communicating only sparingly through attention. They showed that this leads to specialization among the RIMs, which in turn allows for improved generalization on tasks where some factors of variation differ between training and evaluation.
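
The description above can be illustrated with a toy sketch in plain Python. It is not the architecture from the paper: the scoring rule, the `tanh` update, and every name here (`step`, `keys`, `states`, `k`) are assumptions chosen for brevity. It only shows the general pattern of modules that update independently, with a small number of them selected per step and reading from the others through attention.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def step(states, keys, x, k):
    """One toy update in which only the k most relevant modules change."""
    # Relevance of the input to each module (hypothetical scoring rule).
    scores = [dot(key, x) for key in keys]
    ranked = sorted(range(len(states)), key=lambda i: scores[i], reverse=True)
    active = sorted(ranked[:k])

    # Independent update: each active module sees only its own state and x.
    updated = [list(s) for s in states]
    for i in active:
        updated[i] = [math.tanh(s + xi) for s, xi in zip(states[i], x)]

    # Sparse communication: only active modules read from the others,
    # weighting them by attention over their states.
    new_states = [list(s) for s in updated]
    for i in active:
        others = [j for j in range(len(updated)) if j != i]
        weights = softmax([dot(updated[i], updated[j]) for j in others])
        for d in range(len(x)):
            read = sum(w * updated[j][d] for w, j in zip(weights, others))
            new_states[i][d] = updated[i][d] + read
    return new_states, active

# Four toy modules, each with a fixed key saying which inputs it cares about.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.7, 0.7]]
states = [[0.0, 0.0] for _ in keys]
x = [1.0, 0.2]

new_states, active = step(states, keys, x, k=2)
print(active)      # indices of the modules that were updated
print(new_states)  # inactive modules keep their previous state
```

In this sketch the inactive modules are left untouched on each step, so a change in the input affects only the few modules that attend to it.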

"This allows an agent to adapt faster to changes in a distribution or ... inference in order to discover reasons why the change happened," said Bengio.

He outlined a few of the outstanding challenges on the road to conscious systems, including identifying ways to teach models to meta-learn (or understand causal relations embodied in data) and tightening the integration between machine learning and reinforcement learning. But he's confident that the interplay between biological and AI research will eventually unlock the key to machines that can reason like humans -- and even express emotions.

"Consciousness has been studied in neuroscience ... with a lot of progress in the last couple of decades. I think it's time for machine learning to consider these advances and incorporate them into machine learning models," Bengio said.
