Yann LeCun and Yoshua Bengio: Self-supervised learning is the key to human-level intelligence

Self-supervised learning could lead to the creation of AI that's more humanlike in its reasoning, according to Turing Award winners Yoshua Bengio and Yann LeCun. Bengio, director at the Montreal Institute for Learning Algorithms, and LeCun, Facebook VP and chief AI scientist, spoke candidly about this and other research trends during a session at the International Conference on Learning Representations (ICLR) 2020, which took place online.

Supervised learning entails training an AI model on a labeled data set, and LeCun thinks it will play a diminishing role as self-supervised learning comes into wider use. Instead of relying on annotations, self-supervised learning algorithms generate labels from data by exposing relationships between the data's parts, a step believed to be critical to achieving human-level intelligence.
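To make the idea of generating labels from the data itself concrete, here is a minimal sketch of one common self-supervised setup, masked-token prediction. It is an illustration rather than a method described in the session: the function name `make_masked_examples` and the `[MASK]` placeholder are assumptions chosen for the example.

```python
def make_masked_examples(tokens, mask_token="[MASK]"):
    """Turn one unlabeled token sequence into (masked input, target) pairs.

    No human annotation is involved: each "label" is simply the token
    that was hidden from the input.
    """
    examples = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


# Hypothetical unlabeled sentence used only for illustration.
sentence = "the cat sat on the mat".split()
examples = make_masked_examples(sentence)

for masked, target in examples:
    print(" ".join(masked), "->", target)
```

Each pair asks a model to predict a hidden part of the input from the parts that remain visible, which is the relationship between the data's parts that serves as the training signal.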

"Most of what we learn as humans and most of what animals learn is in a self-supervised mode, not a reinforcement mode. It's basically observing the world and interacting with it a little bit, mostly by observation in a test-independent way," said LeCun. "This is the type of [learning] that we don't know how to reproduce with machines."

Uncertainty is a major barrier standing in the way of self-supervised learning's success.

A probability distribution links every possible value of a variable to the probability that the value occurs; for a discrete variable, it can be written as a table. Distributions represent uncertainty perfectly well where the variables are discrete, which is why architectures like Google's BERT are so successful. Unfortunately, researchers haven't yet discovered a way to usefully represent distributions where the variables are continuous -- i.e., where values can fall anywhere in a range and can be obtained only by measuring.
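As a rough illustration of why the discrete case is tractable, the sketch below turns a handful of scores into a probability table with a softmax, the standard way a language model expresses uncertainty over a finite vocabulary. The three candidate words and their scores are made up for the example.

```python
import math


def softmax(scores):
    """Convert a dict of raw scores into a probability table that sums to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - top) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


# Hypothetical scores for the missing word in "the cat sat on the ___".
scores = {"mat": 2.0, "sofa": 1.0, "moon": -1.0}
dist = softmax(scores)

for word, p in dist.items():
    print(f"{word}: {p:.3f}")
```

Because the vocabulary is finite, every possible outcome gets its own row in the table. A continuous variable, such as the pixel values of the next video frame, has no finite list of outcomes to enumerate, so this approach does not carry over directly.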

LeCun notes that one solution to the continuous distribution problem is energy-based models, which assign a scalar "energy" to each configuration of the variables, with low energy for configurations that resemble the training data, and can then be used to generate similar data. Historically, this form of generative modeling has been difficult to apply practically, but recent research suggests it can be adapted to scale across complex topologies.
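The following toy sketch shows the general shape of an energy-based approach, under heavy simplifying assumptions. The energy function here is hand-written rather than learned, and the names `energy` and `low_energy_candidates` are hypothetical; the point is only that a scalar energy can mark several outcomes as equally plausible without producing a normalized probability for each one.

```python
def energy(x, y):
    """Hand-written toy energy: low when y is close to either +x or -x.

    In a real energy-based model this function would be a trained network;
    here it is fixed by hand purely for illustration.
    """
    return min((y - x) ** 2, (y + x) ** 2)


def low_energy_candidates(x, candidates, threshold=1e-9):
    """Return the candidate outputs whose energy falls below the threshold."""
    return [y for y in candidates if energy(x, y) <= threshold]


# Coarse grid of candidate outputs from -3.0 to 3.0 in steps of 0.1.
candidates = [i / 10 for i in range(-30, 31)]
best = low_energy_candidates(2.0, candidates)
print(best)
```

For the input 2.0, both -2.0 and 2.0 come back as compatible answers. A model trained to predict a single value would tend to average the two, which is one reason uncertainty is hard to handle in continuous settings.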

For his part, Bengio believes AI has much to gain from the field of neuroscience, particularly its explorations of consciousness and conscious processing. (It goes both ways -- some neuroscientists are using convolutional neural networks, a type of AI algorithm well-suited to image classification, as a model of the visual system's ventral stream.) Bengio predicts that new studies will elucidate the way high-level semantic variables connect with how the brain processes information, including visual information. These variables are the kinds of things that humans communicate using language, and they could lead to an entirely new generation of deep learning models.

"There's a lot of progress that could be achieved by bringing together things like grounded language learning, where we're jointly trying to understand a model of the world and how high-level concepts are related to each other. This is a kind of joint distribution," said Bengio. "I believe that human conscious processing is exploiting assumptions about how the world might change, which can be conveniently implemented as a high-level representation. Those changes can be explained by interventions, or ... the explanation for what is changing -- what we can see for ourselves because we come up with a sentence that explains the change."

Another missing piece in the human-level intelligence puzzle is background knowledge. As LeCun explained, most humans can learn to drive a car in 30 hours because they've intuited a physical model about how the car behaves. By contrast, the reinforcement learning models deployed on today's autonomous cars started from zero -- they had to make thousands of mistakes before figuring out which decisions weren't harmful.

"Obviously, we need to be able to learn models of the world, and that's the whole reason for self-supervised learning -- running predictive models of the world that would allow systems to learn really quickly by using this model," said LeCun. "Conceptually, it's fairly simple -- except in uncertain environments where we can't predict entirely."

LeCun argues that even self-supervised learning and learnings from neurobiology won't be enough to achieve artificial general intelligence (AGI), or the hypothetical intelligence of a machine with the capacity to understand or learn from any task. That's because intelligence -- even human intelligence -- is very specialized, he says. "AGI does not exist -- there is no such thing as general intelligence," said LeCun. "We can talk about rat-level intelligence, cat-level intelligence, dog-level intelligence, or human-level intelligence, but not artificial general intelligence."

But Bengio believes that, eventually, machines will gain the ability to acquire all kinds of knowledge about the world without having to experience it, likely in the form of verbalizable knowledge.

"I think that's a big advantage for humans, for example, or with respect to other animals," he said. "Deep learning is scaling in a beautiful way, and that's one of its greatest strengths, but I think that culture is a huge reason why we're so intelligent and able to solve problems in the world ... For AI to be useful in the real world, we'll need to have machines that [don't] just translate, but that actually understand natural language."
