Google roasts Apple on computational photography: 'It's not mad science'

Even though I'm a long-term iPhone user, I pay attention every year when Google holds its annual Pixel phone event, as that's one of two opportunities Android phone makers have to convince me to make the big switch -- Samsung's Galaxy Unpacked event for S-series phones is the other. While Samsung typically pitches cutting-edge hardware -- beautiful screens, fast wireless features, and new cameras -- Google takes a different tack. "The hardware isn't what makes our camera so much better," Sabrina Ellis from the Pixel team said today in introducing Pixel 4. "The special sauce that makes our Pixel camera unique is our computational photography."

Computational photography has certainly become a major selling point for Pixel phones. The one Pixel 3 feature that made me jealous last year was Night Sight, a Google-developed machine learning trick that instantly restores brightness and color to dimly lit photos. There was another, less widely appreciated Pixel 3 feature called Super Res Zoom that uses multiple exposures to replace mediocre "digital zoom" performance. In short, high-speed mobile processors and cameras are fundamentally redefining daily photography, and every year, it seems like Google is leading the way.

When Apple marketing chief Phil Schiller got up on stage last month to discuss Deep Fusion, a new iPhone neural engine trick to extract additional detail from nine exposures, he referred to it as "computational photography mad science," eliciting laughter and applause from the audience. But during today's Made by Google '19 event, Google researcher and Stanford professor emeritus Marc Levoy fired an interesting shot back at the marketer: "This isn't 'mad science' -- it's just simple physics."

On one hand, I can understand where Professor Levoy is coming from: When you've spent years developing brilliant Google computational photography tricks such as single-lens portrait mode, synthetic fill-flash, Night Sight, and Super Res Zoom, being described as a mad scientist -- even jokingly -- by a fast-following competitor might feel somewhat disrespectful. I can also understand the desire to respond, particularly with a pithy reference to the supposedly basic science underlying the innovations. It was a great quote, got my attention, and shaded Apple, so ... mission accomplished?

But on the other hand, the finest details of these innovations are moving well beyond the comprehension of average people, arguably to the point where smartphone launch events such as Google's and Apple's might be best served next year with separate post-keynote camera spotlights. Adding a second lens is a lot easier to explain than an AI technique that extracts more detail from a single lens. I think Schiller's glib reference to "mad science" was shorthand for "innovative in ways that are as hard to explain as they are to dream up," and in fact, most observers left Apple's event with little to no idea of how Deep Fusion worked or what it did, outside of "adding more detail."

Interestingly, that's the very crux of computational photography at this point. The "simple physics" of combining multiple exposures to extract additional detail has become not only a viable way to generate better photographs, but also a selling point strong enough to carry annual smartphone updates and all but kill sales of basic standalone cameras. As I noted in an earlier article, Apple's Deep Fusion does for detail what multi-exposure high dynamic range (HDR) photographs did for brightness and color, using math and machine learning to determine and retain only the best pixels from multiple images. Google's latest additions, such as a long-exposure astrophotography mode, Live HDR+ previewing, and dual-exposure captures, all use software to rival if not exceed features in even the latest and most expensive DSLRs.
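For readers who want a feel for what "retaining only the best pixels" could mean, here is a deliberately tiny Python sketch. It is not Apple's Deep Fusion or Google's HDR+ pipeline, neither of which is described in detail here; it is a toy under stated assumptions: the frames are already aligned, they are grayscale grids of numbers, and "best" simply means the pixel with the most contrast against its immediate neighbors. The function names `local_contrast` and `fuse_best_pixels` are my own, purely for illustration.

```python
def local_contrast(frame, y, x):
    """Toy sharpness score: total absolute difference between a pixel
    and its up/down/left/right neighbors (assumption: more contrast = more detail)."""
    height, width = len(frame), len(frame[0])
    center = frame[y][x]
    total = 0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < height and 0 <= nx < width:
            total += abs(center - frame[ny][nx])
    return total


def fuse_best_pixels(frames):
    """For every pixel position, keep the value from whichever frame
    scores highest on local contrast. Assumes all frames share one size
    and are already aligned; ties go to the earliest frame."""
    height, width = len(frames[0]), len(frames[0][0])
    fused = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_frame = max(frames, key=lambda f: local_contrast(f, y, x))
            fused[y][x] = best_frame[y][x]
    return fused


# Hypothetical 3x3 exposures: one smeared flat, one with crisp detail.
blurry = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
sharp = [[0, 9, 0], [9, 0, 9], [0, 9, 0]]
fused = fuse_best_pixels([blurry, sharp])
```

Real phone pipelines add frame alignment, noise modeling, and learned weighting on top of anything this simple, which is presumably where the "mad science" comes in.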

Machine learning, on-device neural engines, and overall improvements in component performance have really come together to revolutionize pocket photography. The Pixel team might downplay the comparative importance of hardware in its camera solutions, and it's true that Pixels don't have as many lenses as iPhone 11 Pro models, or as many megapixels as many other Android rivals. But it's the complete package of millisecond-level camera sensor and image processor responsiveness, high-performance neural analysis, and creative, well-trained photo software that enables average people to just "point and shoot" their way to stunningly clear, detailed images. With their phones.

So here's to the mad scientists who have made these sorts of innovations possible. It may just be simple physics to you as developers, but for those of us who increasingly rely upon your computational photography techniques in our cameras, the results seem closer to magic than science. And I can't wait to see what new tricks you'll have up your sleeves at the next show.
