Facebook wants help improving its computer vision smarts, and to do so it’s opening more of its in-house tools to the developer community, as we reported yesterday.
Computer vision is an arm of artificial intelligence (A.I.) that helps machines understand images by breaking them down and processing them on a pixel-by-pixel basis, rather than via manually input metadata, such as keywords or descriptions.
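To make the distinction concrete, here is a minimal sketch (the image and metadata are made up for illustration) of what "pixel-by-pixel" means: to a machine, an image is just an array of intensity values, while keywords are human-supplied text layered on top.

```python
import numpy as np

# A tiny 4x4 grayscale "image": to a machine, just an array of pixel intensities.
image = np.array([
    [  0,   0, 255, 255],
    [  0,   0, 255, 255],
    [255, 255,   0,   0],
    [255, 255,   0,   0],
], dtype=np.uint8)

# Manually input metadata, by contrast, is human-supplied description.
metadata = {"keywords": ["checkerboard", "pattern"]}

# A vision system must infer structure from the raw pixel values alone,
# starting from nothing richer than simple statistics like this:
mean_intensity = image.mean()
print(mean_intensity)  # 127.5
```

The gap between that raw grid of numbers and a label like "checkerboard" is exactly what computer vision systems have to bridge.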
Photos, videos, and general imagery are an integral part of Facebook’s raison d’être, and enabling computers to identify objects contained within images can be hugely beneficial for classifying visual content at scale. To the human eye, it may be obvious that a video or photo contains three humans, a fridge, and six bottles of beer, but that’s not so easy for a machine to establish of its own accord — and when you throw countless configurations of lighting and colors into the mix, it’s clear that humans still have the upper hand in terms of understanding images.
But computer vision technology has improved by leaps and bounds, and computers are getting pretty good at recognizing what’s in a photo and where the objects are within an image. Detecting objects contained within images is one thing, but segmenting them so that the overlapping edges of objects don’t confuse a machine into lumping part of an animal in with a human it’s standing in front of is more complicated. And it’s here that Facebook’s researchers are pushing the boundaries.
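The detection-versus-segmentation distinction the article draws can be sketched in a few lines (the grid, labels, and `bounding_box` helper here are illustrative, not Facebook's actual code): detection draws a box around each object, while segmentation assigns every pixel to exactly one object, so overlapping boxes never blur two objects together.

```python
import numpy as np

# Two overlapping objects on a 6x6 grid: label 1 = person, label 2 = animal.
# Segmentation assigns one label per pixel, so overlap is resolved exactly.
seg = np.zeros((6, 6), dtype=int)
seg[1:5, 1:4] = 1          # person occupies a block of pixels
seg[3:6, 2:6] = 2          # animal standing partly in front of the person

def bounding_box(mask):
    """Detection-style output: the tight box around a labeled region."""
    rows, cols = np.where(mask)
    return (rows.min(), cols.min(), rows.max(), cols.max())

person_box = bounding_box(seg == 1)
animal_box = bounding_box(seg == 2)
print(person_box, animal_box)   # the two boxes overlap...

# ...but the per-pixel masks do not: no pixel carries both labels.
overlap_pixels = np.logical_and(seg == 1, seg == 2).sum()
print(overlap_pixels)  # 0
```

The overlapping bounding boxes are what can "confuse a machine into lumping" objects together; per-pixel masks are what DeepMask-style segmentation produces instead.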
Now Facebook is putting the code for the algorithms that are powering its computer vision push on GitHub. The DeepMask segmentation framework and the SharpMask segment refinement module are accessible to anyone to contribute to, and will go some way toward helping Facebook improve the technology. Also now on GitHub is MultiPathNet, which Facebook calls a “specialized convolutional net” that labels each object in an image.
“We’re making the code … open and accessible to all, with the hope that they’ll help rapidly advance the field of machine vision,” explained Piotr Dollar, research scientist at Facebook AI Research (FAIR), in a blog post. “As we continue improving these core technologies we’ll continue publishing our latest results and updating the open-source tools we make available to the community.”
More and more companies are shifting into the machine learning realm to build better automated technologies for people. Predictive typing keyboard company SwiftKey, which was recently acquired by Microsoft, is working on a sophisticated back-end built around A.I. This includes artificial neural networks (ANNs) that are more directly based on the structure and workings of the human brain. And stock photo giant Shutterstock built its own convolutional neural network to power its new reverse image search technology.
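The convolutional networks mentioned here (MultiPathNet, Shutterstock's reverse image search) are built around one core operation: sliding a small learned filter across an image to detect local patterns. A minimal from-scratch sketch of that operation, with a hand-written edge filter standing in for a learned one:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a small filter over the image, producing a response map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter: responds where intensity changes left-to-right.
# In a real convolutional net, filter values like these are learned from data.
kernel = np.array([[1.0, -1.0],
                   [1.0, -1.0]])

# Image with a sharp vertical edge down the middle.
image = np.array([
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])

response = convolve2d(image, kernel)
print(response)  # strong (nonzero) response only along the edge column
```

Production networks stack many such filter layers, but the sliding-window multiply-and-sum shown here is the "convolutional" building block they all share.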
By handing its software over to the developer community, Facebook can achieve better results more quickly. Indeed, it’s no stranger to open-sourcing its in-house technology. In the last few months alone, the social network giant has open-sourced Torchnet to accelerate A.I. research, and an SDK for embedding 360 photos and videos into apps. The company actually has more than 200 projects on GitHub, and James Pearce, the head of open source, recently explained why Facebook embraces the open-source community. It comes down to ideology, innovation, and the fact that it’s generally good for business. “Our goal at Facebook is to open-source as much of our technology as possible, in particular the technology we feel would be valuable for the broader engineering community at large,” he said at the time.