UNICA IRIS Institutional Research Information System

3D differential decomposition for video deepfake detection with identity suppression

Jie Gao; Marco Micheletto; Giulia Orrù; Xiaoyi Feng; Gian Luca Marcialis

2026-01-01

Abstract

Detecting deepfake videos remains a challenging task, especially in scenarios involving unknown manipulation methods or unseen data distributions. Most existing video deepfake detection methods rely on high-level semantic features, which often leads to overfitting to facial identity information and poor transferability. In this work, we explore a novel perspective by modeling videos through 3D differential operations along temporal and spatial dimensions. To exploit the spatial–temporal variation information of the video content, the proposed approach decomposes videos into single-axis 1D differential signals, which are then transformed into 2D representations for efficient learning. This procedure enables the use of lightweight 2D CNNs while retaining directional forgery cues. Our experiments, which analyze whether these differential signals capture discriminative patterns useful for distinguishing real from fake content, show that the proposed method achieves strong intra-dataset performance and reveals complementary information across dimensions. These findings suggest that differential signals could support generalization when integrated into broader detection frameworks.
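The abstract does not specify the exact differential operators or how the 1D signals are arranged into 2D representations, so the following is only a minimal sketch of the general idea under stated assumptions: first-order finite differences taken separately along the temporal, vertical, and horizontal axes of a grayscale clip, followed by a hypothetical flattening into 2D maps. The function names `axis_differences` and `to_2d`, the `(T, H, W)` layout, and the flattening rule are illustrative choices, not the authors' method.

```python
import numpy as np


def axis_differences(video: np.ndarray) -> dict:
    """First-order finite differences of a (T, H, W) clip along each axis.

    Assumption: a simple forward difference stands in for the paper's
    differential operation, which the abstract does not define.
    """
    # Cast first so that uint8 subtraction does not wrap around.
    video = video.astype(np.float32)
    return {
        "temporal": np.diff(video, axis=0),    # shape (T-1, H, W)
        "vertical": np.diff(video, axis=1),    # shape (T, H-1, W)
        "horizontal": np.diff(video, axis=2),  # shape (T, H, W-1)
    }


def to_2d(diff: np.ndarray) -> np.ndarray:
    """Hypothetical 2D arrangement: one row per frame, pixels flattened."""
    return diff.reshape(diff.shape[0], -1)


# Synthetic 8-frame, 16x16 clip used only to exercise the functions.
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(8, 16, 16), dtype=np.uint8)

diffs = axis_differences(clip)
maps = {name: to_2d(d) for name, d in diffs.items()}

for name, m in maps.items():
    print(name, m.shape)
```

Each resulting 2D map could then be fed to an ordinary 2D CNN, one per axis, which is the sense in which directional cues stay separated; how the actual method builds and combines these representations is described in the paper itself, not here.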

Publication year

2026

Type:

1.1 Journal article

Files in this item:

File	Size	Format
1-s2.0-S0923596526000482-main.pdf (open access; type: publisher's version, VoR)	4.2 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/485365
