Rain. Smoke. Dirt. Debris-produced distortion would normally spell doom for a videographer, but researchers at global research and development firm Cambridge Consultants say they’ve harnessed artificial intelligence (AI) to reconstruct footage from damaged or obscured frames in real time. In one test on stock video of airfields and aircraft, the system accurately reproduced aircraft on a runway.
The AI system, dubbed DeepRay, will be fully detailed at the upcoming 2019 Consumer Electronics Show in January. It calls to mind Adobe’s distortion-correcting system for front-facing smartphone cameras, and an Nvidia technique that can “fix” corrupt images containing holes. But unlike most earlier AI image-restoration systems, DeepRay handles live video.
“Never before has a new technology enabled machines to interpret real-world scenes the way humans can — and DeepRay can potentially outperform the human eye,” Tim Ensor, commercial director for artificial intelligence at Cambridge Consultants, told VentureBeat in a phone interview. “The ability to construct a clear view of the world from … video, in the presence of continually changing distortion such as rain, mist, or smoke, is transformational.”
DeepRay — a product of Cambridge Consultants’ Digital Greenhouse internal incubator — leverages a machine learning architecture called a generative adversarial network (GAN) to effectively invent video scenes as it attempts to remove distortion. Broadly speaking, GANs are two-part neural networks consisting of generators that produce samples and discriminators that attempt to distinguish between the generated samples and real-world samples. In DeepRay’s case, a total of six networks — a team of generators and discriminators — compete against each other.
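The generator-versus-discriminator contest described above can be made concrete with a toy example. The sketch below is not DeepRay's code, which Cambridge Consultants has not published and which involves six networks. It assumes a one-dimensional linear generator and a logistic discriminator, and the names `generator`, `discriminator`, and `gan_losses` are hypothetical, chosen only for illustration. It computes the two competing losses of a basic GAN: the discriminator is rewarded for telling real samples from generated ones, and the generator is rewarded for fooling it (the commonly used non-saturating form).

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def generator(z, a, c):
    """Toy generator (assumed linear): turn a noise value z into a fake sample."""
    return a * z + c


def discriminator(x, w, b):
    """Toy discriminator (assumed logistic): estimated probability that x is real."""
    return sigmoid(w * x + b)


def gan_losses(real, noise, d_params, g_params):
    """Return (discriminator_loss, generator_loss) for one batch.

    real     -- list of real samples
    noise    -- list of noise values fed to the generator
    d_params -- (w, b) for the discriminator
    g_params -- (a, c) for the generator
    """
    w, b = d_params
    a, c = g_params
    eps = 1e-12  # guards against log(0)

    fake = [generator(z, a, c) for z in noise]

    # Discriminator wants D(real) near 1 and D(fake) near 0.
    d_real = sum(math.log(discriminator(x, w, b) + eps) for x in real) / len(real)
    d_fake = sum(math.log(1.0 - discriminator(x, w, b) + eps) for x in fake) / len(fake)
    d_loss = -(d_real + d_fake)

    # Generator wants D(fake) near 1, i.e. it wants to fool the discriminator.
    g_loss = -sum(math.log(discriminator(x, w, b) + eps) for x in fake) / len(fake)

    return d_loss, g_loss


if __name__ == "__main__":
    # An undecided discriminator (w = 0, b = 0) scores every sample 0.5.
    print(gan_losses([1.0, 2.0], [0.0, 0.5], (0.0, 0.0), (1.0, 0.0)))
    # A discriminator that separates real from fake well: its own loss is
    # small, while the generator's loss is large.
    print(gan_losses([5.0], [0.0], (2.0, 0.0), (1.0, -5.0)))
```

In training, the two sets of parameters would be updated in alternation, each side lowering its own loss, which is what makes the setup adversarial. When the discriminator cannot tell the samples apart and scores everything 0.5, its loss is 2 ln 2 and the generator's loss is ln 2.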
Research in GANs has advanced by leaps and bounds in recent years, particularly in the realm of machine vision. Google’s DeepMind subsidiary in October unveiled a GAN-based system that can create convincing photos of food, landscapes, portraits, and animals out of whole cloth. In September, Nvidia researchers developed an AI model that produces synthetic scans of brain cancer, and in August, a team at Carnegie Mellon demonstrated AI that could transfer a person’s recorded motion and facial expressions to a target subject in another photo or video. More recently, scientists at the University of Edinburgh’s Institute for Perception and Institute for Astronomy designed a GAN that can hallucinate galaxies — or high-resolution images of them, at least.
Ensor contends that only in the past two years has it been possible to train multi-network GANs at scale, in large part thanks to advances in purpose-built AI chips such as Google’s Tensor Processing Units (TPUs).
“We’re excited to be at the leading edge of developments in AI. DeepRay shows us making the leap from the art of the possible, to delivering breakthrough innovation with significant impact on our client’s businesses,” Ensor said. “This takes us into a new era of image sensing and will give flight to applications in many industries, including automotive, agritech and healthcare.”
At CES, the DeepRay team will demonstrate a neural network trained on Nvidia’s DGX-1 platform that, running on a standard gaming laptop, can remove distortion introduced by an opaque pane of glass. The training dataset consists of 100,000 still images, and Ensor said that the team hasn’t characterized the system’s performance with larger sample sizes.
“As with all [AI models], it continues to improve with training,” he explained, “and it will degrade gracefully.”