Will AI power video chats of the future? That’s what Nvidia implied this week with the unveiling of Maxine, a platform that provides developers with a suite of GPU-accelerated AI conferencing software. Maxine brings AI effects including gaze correction, super-resolution, noise cancellation, face relighting, and more to end users, while in the process reducing how much bandwidth videoconferencing consumes. Quality-preserving compression is a welcome innovation at a time when videoconferencing is contributing to record bandwidth usage. But Maxine’s other, more cosmetic features raise uncomfortable questions about AI’s negative — and possibly prejudicial — impact.
A quick recap: Maxine employs AI models called generative adversarial networks (GANs) to modify faces in video feeds. Top-performing GANs can create realistic portraits of people who don’t exist, for instance, or snapshots of fictional apartment buildings. In Maxine’s case, they can enhance the lighting in a video feed and recomposite frames in real time.
Bias in computer vision algorithms is pervasive, with Zoom’s virtual backgrounds and Twitter’s automatic photo-cropping tool disfavoring people with darker skin. Nvidia hasn’t detailed the datasets or AI model training techniques it used to develop Maxine, and it’s possible that the platform does not, for instance, manipulate Black faces as effectively as light-skinned faces.
Beyond the bias issue, there’s the fact that facial enhancement algorithms aren’t always good for mental health. Studies by Boston Medical Center and others show that filters and photo editing can take a toll on people’s self-esteem and trigger disorders like body dysmorphia. In response, Google earlier this month said it would turn off by default the “beauty” filters on its smartphones that smooth out pimples, freckles, wrinkles, and other skin imperfections. “When you’re not aware that a camera or photo app has applied a filter, the photos can negatively impact mental wellbeing,” the company said in a statement. “These default filters can quietly set a beauty standard that some people compare themselves against.”
That’s not to mention how Maxine might be used to get around deepfake detection. Several of the platform’s features analyze the facial points of people on a call and then algorithmically reanimate the faces in the video on the other side, which could interfere with the ability of a system to identify whether a recording has been edited. Nvidia will presumably build in safeguards to prevent this (currently, Maxine is available to developers only in early access), but the potential for abuse is a question the company hasn’t yet addressed.
Nvidia provided this statement: “Our research team paid close attention to racial, gender, age, and cultural diversity while developing the AI features in the Nvidia Maxine platform for video conference applications. They curated about a thousand hours of video training data with representation across broad communities so that the technology will be usable by as many people as possible, from all backgrounds … Since Maxine is a modular platform, app developers can choose which features to include in their video conferencing applications. They can include — or not include — AI-enabled features like gaze and face alignment to help calls feel more like natural conversation, enabling people to look at the faces on their screens rather than their own web cameras.”
None of this is to suggest that Maxine is malicious by design. Gaze correction, face relighting, upscaling, and compression seem useful. But the issues Maxine raises point to a lack of consideration for the harms its technology might cause, a tech industry misstep so common it’s become a cliche. The best-case scenario is that Nvidia takes steps (if it hasn’t already) to minimize the ill effects that might arise. The fact that the company didn’t reserve airtime to spell out these steps at Maxine’s unveiling, however, doesn’t instill confidence.
Thanks for reading,
AI Staff Writer
VentureBeat