Study warns deepfakes can fool facial recognition

Deepfakes, or AI-generated videos that take a person in an existing video and replace them with someone else's likeness, are multiplying at an accelerating rate. According to startup Deeptrace, the number of deepfakes on the web increased 330% from October 2019 to June 2020, reaching over 50,000 at their peak. That's troubling not only because these fakes might be used to sway opinion during an election or implicate a person in a crime, but because they've already been abused to generate pornographic material of actors and defraud a major energy producer.

Open source tools make it possible for anyone with images of a victim to create a convincing deepfake, and a new study suggests that deepfake-generating techniques have reached the point where they can reliably fool commercial facial recognition services. In a paper published on the preprint server arXiv.org, researchers at Sungkyunkwan University in Suwon, South Korea, demonstrate that APIs from Microsoft and Amazon can be fooled with commonly used deepfake-generating methods. In one case, one of the APIs -- Microsoft's Azure Cognitive Services -- was fooled by up to 78% of the deepfakes the coauthors fed it.

"From experiments, we find that some deepfake generation methods are of greater threat to recognition systems than others and that each system reacts to deepfake impersonation attacks differently," the researchers wrote. "We believe our research findings can shed light on better designing robust web-based APIs, as well as appropriate defense mechanisms, which are urgently needed to fight against malicious use of deepfakes."

The researchers chose to benchmark facial recognition APIs from Microsoft and Amazon because both companies offer services to recognize celebrity faces. The APIs return a face similarity scoring metric that makes it possible to compare their performance. And because celebrity face images are plentiful compared with those of the average person, the researchers were able to generate deepfakes from them relatively easily. Google offers celebrity recognition via its Cloud Vision API, but the researchers say the company denied their formal request to use it.

To see the extent to which commercial facial recognition APIs could be fooled by deepfakes, the researchers used AI models trained on five different datasets -- three publicly available and two that they created themselves -- containing the faces of Hollywood movie stars, singers, athletes, and politicians. They created 8,119 deepfakes from the datasets in total. Then they extracted faces from the deepfakes' video frames and had the services attempt to predict which celebrity was pictured.

The researchers found that both APIs were susceptible to being fooled by the deepfakes. Azure Cognitive Services mistook a deepfake for a target celebrity 78% of the time, while Amazon's Rekognition mistook it 68.7% of the time. Rekognition misclassified deepfakes of a celebrity as another real celebrity 40% of the time and gave 902 out of 3,200 deepfakes higher confidence scores than the same celebrity's real image. And in an experiment with Azure Cognitive Services, the researchers successfully impersonated 94 out of 100 celebrities in one of the open source datasets.
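The rates reported above are simple ratios over the predictions an API returns. As a minimal sketch, assuming each deepfake is logged with its intended target, the name the API returned, and two confidence scores, the tallies could be computed as follows. The record layout, the function names, and the four sample records are hypothetical and are not the researchers' code or data; only the 902-out-of-3,200 figure comes from the study.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Prediction:
    """One API response for one deepfake face crop (hypothetical layout)."""
    target: str               # celebrity the deepfake impersonates
    predicted: Optional[str]  # name the API returned, or None if no match
    fake_score: float         # confidence the API gave the deepfake
    real_score: float         # confidence the API gave the target's real image


def attack_success_rate(preds: Sequence[Prediction]) -> float:
    """Fraction of deepfakes the API labeled as the impersonated target."""
    if not preds:
        return 0.0
    return sum(p.predicted == p.target for p in preds) / len(preds)


def other_celebrity_rate(preds: Sequence[Prediction]) -> float:
    """Fraction of deepfakes labeled as a celebrity other than the target."""
    if not preds:
        return 0.0
    wrong = sum(p.predicted is not None and p.predicted != p.target for p in preds)
    return wrong / len(preds)


def outscored_real_rate(preds: Sequence[Prediction]) -> float:
    """Fraction of deepfakes that scored higher than the real image."""
    if not preds:
        return 0.0
    return sum(p.fake_score > p.real_score for p in preds) / len(preds)


# Made-up records, for illustration only.
sample = [
    Prediction("A", "A", 0.90, 0.80),   # fooled, and outscored the real image
    Prediction("A", "B", 0.70, 0.95),   # labeled as a different celebrity
    Prediction("B", "B", 0.60, 0.90),   # fooled, but scored below the real image
    Prediction("C", None, 0.00, 0.99),  # no celebrity recognized
]

print(f"attack success:  {attack_success_rate(sample):.1%}")
print(f"other celebrity: {other_celebrity_rate(sample):.1%}")
print(f"outscored real:  {outscored_real_rate(sample):.1%}")

# The study's Rekognition figure expressed as a percentage.
print(f"{902 / 3200:.1%}")  # 28.2%
```

Put this way, the 902 deepfakes that outscored a genuine image amount to roughly 28% of the 3,200 tested.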

The coauthors attribute the high success rate of their attacks to the fact that deepfakes tend to preserve the same identity as the target video. As a result, when the Microsoft and Amazon services made mistakes, they tended to do so with high confidence, with Amazon's exhibiting a "considerably" higher susceptibility to being fooled by deepfakes.

"Assuming the underlying face recognition API cannot distinguish the deepfake impersonator from the genuine user, it can cause many privacy, security, and repudiation risks, as well as numerous fraud cases," the researchers warn. "Voice and video deepfake technologies can be combined to create multimodal deepfakes and used to carry out more powerful and realistic phishing attacks ... [And] if the commercial APIs fail to filter the deepfakes on social media, it will allow the propagation of false information and harm innocent individuals."

Microsoft and Amazon declined to comment.

The study's findings show that the fight against deepfakes is likely to remain challenging, especially as media generation techniques continue to improve. Just this week, deepfake footage of Tom Cruise posted to an unverified TikTok account racked up 11 million views on the app and millions more on other platforms. And when the clips were run through several of the best publicly available deepfake detection tools, they avoided discovery, according to Vice.

In an attempt to fight the spread of deepfakes, Facebook -- along with Amazon and Microsoft, among others -- spearheaded the Deepfake Detection Challenge, which ended last June. The challenge's launch came after the release of a large corpus of visual deepfakes produced in collaboration with Jigsaw, Google's internal technology incubator, which was incorporated into a benchmark made freely available to researchers for synthetic video detection system development.

More recently, Microsoft launched its own deepfake-combating solution in Video Authenticator, a tool that can analyze a still photo or video to provide a score for its level of confidence that the media hasn't been artificially manipulated. The company also developed a technology built into Microsoft Azure that enables a content producer to add metadata to a piece of content, as well as a reader that checks the metadata to let people know that the content is authentic.
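The producer-and-reader arrangement described above can be illustrated in general terms, though the details of Microsoft's system are not given here. The following is a rough sketch of the idea only, assuming a keyed hash (HMAC-SHA256) over the content bytes stands in for whatever the real system uses. The function names, the metadata fields, and the shared key are all hypothetical, and a production system would more plausibly rely on public-key signatures than on a shared secret.

```python
import hashlib
import hmac


def sign_content(content: bytes, key: bytes, producer: str) -> dict:
    """Producer side: build metadata binding the content to a keyed hash."""
    tag = hmac.new(key, content, hashlib.sha256).hexdigest()
    return {"producer": producer, "algorithm": "HMAC-SHA256", "tag": tag}


def verify_content(content: bytes, metadata: dict, key: bytes) -> bool:
    """Reader side: recompute the hash and compare it to the metadata."""
    expected = hmac.new(key, content, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, metadata.get("tag", ""))


# Hypothetical key and content, for illustration only.
key = b"demo-key-not-for-real-use"
original = b"frame bytes of an authentic video"
tampered = b"frame bytes of a manipulated video"

metadata = sign_content(original, key, producer="Example Newsroom")

print(verify_content(original, metadata, key))  # True
print(verify_content(tampered, metadata, key))  # False
```

A check like this only shows that the content is unchanged since the producer tagged it. It says nothing about whether the content was genuine when it was tagged, which is the question a detector such as Video Authenticator tries to answer.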

"We expect that methods for generating synthetic media will continue to grow in sophistication. As all AI detection methods have rates of failure, we have to understand and be ready to respond to deepfakes that slip through detection methods," Microsoft CVP of customer security and trust Tom Burt wrote in a blog post last September. "Thus, in the longer term, we must seek stronger methods for maintaining and certifying the authenticity of news articles and other media."