Researchers at Facebook have taken a step closer to a holy grail of artificial intelligence known as unsupervised learning. They’ve come up with a way to generate synthetic images that don’t look all that fake.
In fact, the computer-generated samples — of scenes featuring planes, cars, birds, and other objects — were mistaken for real images by human volunteers about 40 percent of the time, according to a new paper on the research posted online yesterday. Facebook has submitted the paper for consideration in the upcoming Neural Information Processing Systems (NIPS) conference in Montreal.
Supervised deep learning traditionally involves training artificial neural networks on a large pile of data that comes with labels — for instance, “these 100 pictures show geese” — and then showing them a new piece of data, like a picture of an ostrich, to receive an educated guess about whether the new picture depicts a goose.
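The supervised recipe described above — learn from labeled examples, then make an educated guess about a new one — can be sketched in a few lines. This is a deliberately minimal stand-in (a nearest-centroid classifier on made-up feature vectors, not a deep neural network, and the labels and data are invented for illustration):

```python
import numpy as np

# Hypothetical labeled "training data": feature vectors standing in for images.
geese = np.random.default_rng(1).normal(0.0, 1.0, size=(100, 4))   # label: goose
planes = np.random.default_rng(2).normal(5.0, 1.0, size=(100, 4))  # label: plane

# "Training" here is just computing one prototype vector per label.
centroids = {"goose": geese.mean(axis=0), "plane": planes.mean(axis=0)}

def classify(x):
    # Educated guess: the label whose prototype is closest to the new example.
    return min(centroids, key=lambda label: np.linalg.norm(x - centroids[label]))

print(classify(np.full(4, 4.8)))  # closest to the "plane" prototype
```

A real image-recognition system replaces the centroid lookup with a deep network trained on millions of labeled photos, but the workflow — labeled data in, guess about unlabeled data out — is the same.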
With unsupervised learning, there are no labeled pictures to learn from. It’s sort of like the way people learn to identify things. Once you’ve seen one or two cell phones, you can recognize them immediately.
Facebook is pursuing unsupervised learning presumably to do a better (or more automated) job of some of the tasks where supervised learning can already be applied: image recognition, video recognition, natural language processing, and speech recognition. But should the research advance further, whole new uses could be dreamed up.
For now, Facebook is simply conducting “pure research,” Facebook research scientist and lead author Rob Fergus told VentureBeat in an interview.
And that “pure research” is utterly fascinating. Google this week demonstrated that its neural networks can generate downright trippy representations of images — Fergus said they “look super cool” — but fundamentally that work “doesn’t get you any further in solving the unsupervised learning problem,” Fergus said. It’s much harder, he said, to generate images that look real than images that look psychedelic.
To do this, Facebook is using not one but two trained neural networks — one generative and one discriminative. You give the generative one a random vector, and it generates an image. The discriminative network then decides whether that image looks real or generated. The two are trained against each other: the generator keeps improving until its output can fool the discriminator.
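The two-network tug-of-war can be sketched in miniature. The toy below is an assumption-laden stand-in for the real system: the "images" are single numbers drawn from a bell curve, and both networks are one-layer linear models rather than deep nets, but the training loop — discriminator learns to separate real from generated, generator learns to fool it — follows the same adversarial pattern:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "real data": scalars from N(4, 1), standing in for real photos.
def real_batch(n):
    return rng.normal(4.0, 1.0, size=(n, 1))

# Generator: maps a random vector to a "sample" (one linear layer).
G_w, G_b = rng.normal(size=(1, 1)) * 0.1, np.zeros(1)
# Discriminator: logistic regression scoring how "real" a sample looks.
D_w, D_b = rng.normal(size=(1, 1)) * 0.1, np.zeros(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30, 30)))

def G(z):
    return z @ G_w + G_b            # generated sample

def D(x):
    return sigmoid(x @ D_w + D_b)   # probability the input is real

lr = 0.02
for step in range(3000):
    z = rng.normal(size=(32, 1))
    fake, real = G(z), real_batch(32)

    # Discriminator update: push D(real) toward 1 and D(fake) toward 0.
    g_real = D(real) - 1.0          # cross-entropy gradient, real examples
    g_fake = D(fake)                # cross-entropy gradient, fake examples
    D_w -= lr * (real.T @ g_real + fake.T @ g_fake) / 32
    D_b -= lr * (g_real.mean() + g_fake.mean())

    # Generator update: push D(G(z)) toward 1, i.e. fool the discriminator.
    g = (D(G(z)) - 1.0) @ D_w.T     # backprop through D into the sample
    G_w -= lr * (z.T @ g) / 32
    G_b -= lr * g.mean()

# With successful training, generated samples drift toward the real data.
print(G(rng.normal(size=(1000, 1))).mean())
```

Facebook's actual system applies this idea to pixels, with deep convolutional networks on both sides, which is what makes the generated scenes plausible rather than just well-averaged.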
The resulting system can produce tiny 64-by-64 pixel images.
“The resolution is enough to have a lot of complication to the scene,” Fergus said. “There’s quite a lot of subtlety and fidelity to them.”
Naturally, the researchers will be training the system to work with larger and larger images over time.