Scientists find a way to detect AI deepfake videos


Scientists are arming AI against AI in a quest to identify deepfake videos.

AI-generated content has been prevalent for some time, but video generation has recently reached new heights. Since its debut in February, OpenAI’s Sora has stunned viewers with hyper-realistic synthetic footage generated entirely from text prompts. Such content is so convincing that it is fueling rising concerns about the spread of misinformation.

The Multimedia and Information Security Lab (MISL) in Drexel’s College of Engineering has spent more than a decade developing technologies to detect manipulated imagery.


Existing detection methods, however, have proven ineffective against AI-generated video. The scientists evaluated 11 publicly available synthetic image detectors, each of which flagged manipulated images with at least 90% accuracy. But their performance dropped by 20-30% when they were tasked with identifying videos created by publicly available AI generators.

“It’s more than a bit unnerving that this video technology could be released before there’s a good system for detecting fakes created by bad actors,” Matthew Stamm, PhD, an associate professor in Drexel’s College of Engineering and director of the MISL, said in a press release.

Stamm thinks that once the technology is publicly available, malicious use is inevitable. “That’s why we’re working to stay ahead of them by developing the technology to identify synthetic videos from patterns and traits that are endemic to the media,” he added.

Until recently, manipulated imagery was produced with photo and video editing programs that alter pixels, adjust speed, or remove and splice frames. These edits leave digital traces that the scientists at MISL can reliably identify with their suite of forensic tools.

Deepfake detection
Top row: video frames taken from AI-generated videos. Bottom row: Fourier transforms of the residual forensic traces for each corresponding frame above | Source: Research
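To make these forensic traces concrete, here is a minimal sketch of the kind of analysis the figure above depicts: suppressing scene content with a high-pass filter and examining the Fourier transform of the residual that remains. The 3x3 kernel is a generic illustrative choice, not MISL’s actual filter, and numpy and scipy are assumed.

```python
# Sketch: extracting a forensic residual from a single frame and computing
# its Fourier transform, as shown in the figure above. The Laplacian-style
# kernel is illustrative only, not MISL's actual filter.
import numpy as np
from scipy.signal import convolve2d

def forensic_residual(frame: np.ndarray) -> np.ndarray:
    """High-pass filter a grayscale frame, suppressing scene content and
    keeping the low-level noise where generator fingerprints tend to live."""
    kernel = np.array([[ 0, -1,  0],
                       [-1,  4, -1],
                       [ 0, -1,  0]], dtype=np.float64)
    return convolve2d(frame.astype(np.float64), kernel, mode="same")

def residual_spectrum(frame: np.ndarray) -> np.ndarray:
    """Log-magnitude 2-D Fourier transform of the residual; periodic
    artifacts left by a generator appear as bright off-center peaks."""
    spectrum = np.fft.fftshift(np.fft.fft2(forensic_residual(frame)))
    return np.log1p(np.abs(spectrum))
```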

Text-to-video output, by contrast, is neither captured by a camera nor edited with visual software, so it poses new challenges for detecting manipulation. In MISL's latest research, the scientists decided to use AI against AI to learn how generative programs construct their videos, successfully training a machine learning algorithm called a constrained neural network.
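The article does not describe the network’s internals, but “constrained neural network” matches a technique Stamm’s group has published for image forensics: a first convolutional layer whose kernels are constrained to act as prediction-error filters, forcing the network to learn low-level residual traces instead of scene content. The sketch below shows one way to implement that constraint, assuming PyTorch; the kernel size and channel counts are placeholders, and the video architecture MISL actually used may differ.

```python
# Sketch of a constrained first convolutional layer in the spirit of the
# approach described above (center tap fixed at -1, off-center taps summing
# to 1). PyTorch is assumed; the surrounding network and training loop are
# not specified in the article.
import torch
import torch.nn as nn

class ConstrainedConv2d(nn.Conv2d):
    def constrain(self) -> None:
        """Project each kernel back onto the constraint set; call this
        after every optimizer step so the layer stays a prediction-error
        filter while its off-center taps remain learnable."""
        with torch.no_grad():
            w = self.weight           # shape: (out_ch, in_ch, k, k)
            c = w.shape[-1] // 2      # index of the center tap
            w[:, :, c, c] = 0.0
            # Normalize off-center taps to sum to 1 per kernel...
            w /= w.sum(dim=(2, 3), keepdim=True)
            # ...then fix the center tap at -1, so each kernel outputs a
            # pixel's value minus a weighted prediction from its neighbors.
            w[:, :, c, c] = -1.0

layer = ConstrainedConv2d(in_channels=1, out_channels=3, kernel_size=5, padding=2)
layer.constrain()  # re-apply after each optimizer.step() during training
```

Because scene content is filtered out at the first layer, whatever the rest of the network learns is driven by the residual statistics that differ between camera footage and generated video.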

The network learned what a synthetic video looks like at a granular level and applied that knowledge to new sets of videos generated by AI video generators such as Stable Video Diffusion, Video-Crafter, and Cog-Video, as well as by programs it had never encountered during training.
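The article does not say how frame-level evidence is combined into a verdict for a whole clip; a simple, commonly used scheme is to score each frame and average, as in the hypothetical sketch below. The `detector` model and the 0.5 threshold are placeholder assumptions, not MISL’s method.

```python
# Sketch: applying a trained frame-level detector to a whole video by
# averaging per-frame scores. `detector` and the threshold are placeholders;
# the aggregation MISL used is not described in the article.
import torch

@torch.no_grad()
def classify_video(detector: torch.nn.Module, frames: torch.Tensor,
                   threshold: float = 0.5) -> bool:
    """frames: (num_frames, channels, height, width), already preprocessed.
    Returns True if the clip is judged synthetic."""
    detector.eval()
    scores = torch.sigmoid(detector(frames)).squeeze(-1)  # one score per frame
    return scores.mean().item() > threshold
```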

The algorithm proved more than 93% effective at identifying synthetic videos, and it could also accurately identify the program used to create them.
