Robust deepfake detection in compressed videos with scalable network strategies

Perelli G.; Micheletto M.; Puglisi G.; Luca Marcialis G.
2026-01-01

Abstract

Deepfakes leverage artificial intelligence to generate highly realistic but falsified visual content, raising concerns for security and trust in digital media. Detecting such manipulations becomes more challenging when videos are compressed, as compression algorithms introduce artifacts that obscure forensic evidence. One possible solution is to train separate models for different compression levels; however, this approach increases computational costs and limits scalability. To address this challenge, we introduce a unified framework designed to improve robustness against varying degrees of video compression. Our approach combines (i) a dedicated MPEG-based augmentation strategy tailored for compressed videos, and (ii) two architectural designs, the Multi-Head Network (MHN) and the Multi-Branch Network (MBN). The MHN extends a standard backbone by appending lightweight output layers, or "heads", that jointly predict deepfake likelihood and compression level, enabling compression-aware detection with minimal architectural changes. The MBN combines multiple MHNs into a modular, parallel architecture, offering an alternative to conventional depth-based model scaling. Experiments on the FaceForensics++ and Celeb-DF datasets show that both MHN and MBN improve detection performance in compressed scenarios. Notably, MHN applied to a lightweight backbone outperforms deeper and more complex models without the multi-head extension, making the proposed solution well-suited for deployment in resource-constrained settings.
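The two designs named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give layer sizes, the loss, or how the branches are fused, so the single dense "backbone", the weighted sum of the two losses (`weight`), and the averaging of branch outputs are all assumptions made for illustration. The class and function names (`MultiHeadSketch`, `MultiBranchSketch`, `joint_loss`) are hypothetical.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiHeadSketch:
    """Shared backbone followed by two lightweight heads (assumed layout)."""

    def __init__(self, n_in, n_feat, n_levels, seed=0):
        rng = random.Random(seed)
        # Assumption: one dense layer stands in for a real CNN backbone.
        self.backbone = init_layer(rng, n_feat, n_in)
        self.fake_head = init_layer(rng, 1, n_feat)          # deepfake likelihood
        self.level_head = init_layer(rng, n_levels, n_feat)  # compression level

    def forward(self, x):
        feats = [max(0.0, h) for h in linear(x, *self.backbone)]  # ReLU features
        p_fake = sigmoid(linear(feats, *self.fake_head)[0])
        p_levels = softmax(linear(feats, *self.level_head))
        return p_fake, p_levels


def joint_loss(p_fake, p_levels, is_fake, level, weight=0.5):
    """Assumed joint objective: binary cross-entropy plus weighted cross-entropy."""
    eps = 1e-12
    bce = -(is_fake * math.log(p_fake + eps)
            + (1 - is_fake) * math.log(1.0 - p_fake + eps))
    ce = -math.log(p_levels[level] + eps)
    return bce + weight * ce


class MultiBranchSketch:
    """Several multi-head networks run in parallel on the same input."""

    def __init__(self, n_branches, n_in, n_feat, n_levels):
        self.branches = [MultiHeadSketch(n_in, n_feat, n_levels, seed=i)
                         for i in range(n_branches)]

    def forward(self, x):
        outs = [branch.forward(x) for branch in self.branches]
        # Assumption: branch predictions are fused by simple averaging.
        p_fake = sum(p for p, _ in outs) / len(outs)
        n_levels = len(outs[0][1])
        p_levels = [sum(o[1][k] for o in outs) / len(outs) for k in range(n_levels)]
        return p_fake, p_levels
```

In this sketch, adding a branch widens the model instead of deepening it, which mirrors the abstract's contrast between modular, parallel scaling and depth-based scaling. A real detector would replace the dense stand-in with a convolutional backbone and train both heads with gradient descent.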
2026
Deepfake detection; Biometrics; Pattern recognition; Computer vision
Files in this item:
File: 1-s2.0-S0957417426006743-main.pdf
Access: open access
Type: publisher's version (VoR)
Size: 8.01 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/485366