Deepfake detectors and datasets exhibit racial and gender bias, USC study shows

Some experts have expressed concern that machine learning tools could be used to create deepfakes, or videos that take a person in an existing video and replace them with someone else's likeness. The fear is that these fakes might be used to do things like sway opinion during an election or implicate a person in a crime. Already, deepfakes have been abused to generate pornographic material of actors and defraud a major energy producer.

Fortunately, efforts are underway to develop automated methods to detect deepfakes. Facebook, along with Amazon, Microsoft, and others, spearheaded the Deepfake Detection Challenge, which ended last June. The challenge's launch came after the release of a large corpus of visual deepfakes produced in collaboration with Jigsaw, Google's internal technology incubator. That corpus was incorporated into a benchmark made freely available to researchers developing synthetic video detection systems. More recently, Microsoft launched its own deepfake-combating solution, Video Authenticator, a system that can analyze a still photo or video to provide a score for its level of confidence that the media hasn't been artificially manipulated.

But according to researchers at the University of Southern California, some of the datasets used to train deepfake detection systems might underrepresent people of a certain gender or with specific skin colors. This bias can be amplified in deepfake detectors, the coauthors say, with some detectors showing up to a 10.7% difference in error rate depending on the racial group.

Biased deepfake detectors

The results, while perhaps surprising to some, are in line with previous research showing that computer vision models are susceptible to harmful, pervasive prejudice. A paper last fall by University of Colorado, Boulder researchers demonstrated that AI from Amazon, Clarifai, Microsoft, and others maintained accuracy rates above 95% for cisgender men and women but misidentified trans men as women 38% of the time. Independent benchmarks of major vendors' systems by the Gender Shades project and the National Institute of Standards and Technology (NIST) have demonstrated that facial recognition technology exhibits racial and gender bias and have suggested that current facial recognition programs can be wildly inaccurate, misclassifying people upwards of 96% of the time.

The University of Southern California group looked at three deepfake detection models with "proven success in detecting deepfake videos." All were trained on the FaceForensics++ dataset, which is commonly used for deepfake detectors, as well as corpora including Google's DeepfakeDetection, CelebDF, and DeeperForensics-1.0.

In a benchmark test, the researchers found that all of the detectors performed worst on videos with darker Black faces, especially male Black faces. Videos with female Asian faces had the highest accuracy, but depending on the dataset, the detectors also performed well on Caucasian (particularly male) and Indian faces.

According to the researchers, the deepfake detection datasets were "strongly" imbalanced in terms of gender and racial groups. Over 58% of FaceForensics++ sample videos showed women (mostly white), compared with 41.7% showing men. Less than 5% of the real videos showed Black or Indian people, and the datasets contained "irregular swaps," where a person's face was swapped onto another person of a different race or gender.
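
An imbalance of this kind can be surfaced with a simple composition audit. The sketch below is illustrative only and is not the researchers' method: it assumes each video already carries a demographic annotation, and the group names, counts, and the 5% threshold are hypothetical placeholders rather than figures from the study.

```python
from collections import Counter

def group_shares(labels):
    """Return each group's share of the annotated videos."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def underrepresented(shares, threshold=0.05):
    """List groups whose share falls below the (assumed) threshold."""
    return sorted(g for g, share in shares.items() if share < threshold)

# Hypothetical annotations, one label per video; not data from the study.
labels = ["group_a"] * 60 + ["group_b"] * 37 + ["group_c"] * 3

shares = group_shares(labels)
flagged_groups = underrepresented(shares)
print(shares)
print(flagged_groups)
```

Running a check like this before training would at least make gaps in coverage visible, though it says nothing about irregular swaps, which require inspecting how source and target faces were paired.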

These irregular swaps, while intended to mitigate bias, are in fact to blame for at least a portion of the bias in the detectors, the coauthors hypothesize. Trained on the datasets, the detectors learned correlations between fakeness and, for example, Asian facial features. One corpus used Asian faces as foreground faces swapped onto female Caucasian faces and female Hispanic faces.

"In a real-world scenario, facial profiles of female Asian or female African are 1.5 to 3 times more likely to be mistakenly labeled as fake than profiles of the male Caucasian ... The proportion of real subjects mistakenly identified as fake can be much larger for female subjects than male subjects," the researchers wrote.

Real-world risks

The findings are a stark reminder that even the "best" AI systems aren't necessarily flawless. As the coauthors note, at least one deepfake detector in the study achieved 90.1% accuracy on a test dataset, a metric that conceals the biases within.
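
To see how a strong headline number can hide a disparity, consider a minimal sketch that reports overall accuracy alongside the per-group false positive rate, meaning the share of real videos wrongly flagged as fake. Everything here is an assumption for illustration: the record format, the group names, and the toy counts are hypothetical and do not reproduce the study's data or evaluation code.

```python
from collections import defaultdict

def overall_accuracy(records):
    """Fraction of videos whose predicted label matches the true label."""
    correct = sum(is_fake == predicted_fake for _, is_fake, predicted_fake in records)
    return correct / len(records)

def false_positive_rates(records):
    """Per-group share of real videos that were flagged as fake."""
    real = defaultdict(int)
    flagged = defaultdict(int)
    for group, is_fake, predicted_fake in records:
        if not is_fake:
            real[group] += 1
            if predicted_fake:
                flagged[group] += 1
    return {group: flagged[group] / real[group] for group in real}

# Hypothetical toy results as (group, is_fake, predicted_fake) tuples.
records = (
    [("group_a", False, False)] * 57
    + [("group_a", False, True)] * 3
    + [("group_a", True, True)] * 20
    + [("group_b", False, False)] * 17
    + [("group_b", False, True)] * 3
)

accuracy = overall_accuracy(records)
rates = false_positive_rates(records)
ratio = rates["group_b"] / rates["group_a"]
print(f"overall accuracy: {accuracy:.2f}")
print(f"false positive rates: {rates}")
print(f"group_b vs. group_a: {ratio:.1f}x")
```

In this toy setup the detector looks accurate overall, yet real videos from the smaller group are flagged as fake several times more often, which is the pattern a single aggregate metric cannot reveal.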

"[U]sing a single performance metrics such as ... detection accuracy over the entire dataset is not enough to justify massive commercial rollouts of deepfake detectors," the researchers wrote. "As deepfakes become more pervasive, there is a growing reliance on automated systems to combat deepfakes. We argue that practitioners should investigate all societal aspects and consequences of these high impact systems."

The research is especially timely in light of growth in the commercial deepfake video detection market. Amsterdam-based Sensity (formerly Deeptrace Labs) offers a suite of monitoring products that purport to classify deepfakes uploaded to social media, video hosting platforms, and disinformation networks. Dessa has proposed techniques for improving deepfake detectors trained on datasets of manipulated videos. And Truepic raised an $8 million funding round in July 2018 for its video and photo deepfake detection services. In December 2018, the company acquired another deepfake "detection-as-a-service" startup -- Fourandsix -- whose fake image detector was licensed by DARPA.