UNICA IRIS Institutional Research Information System

LFPD: Local-Feature-Powered Defense against adaptive backdoor attacks

Guo, Wei; Demontis, Ambra; Pintor, Maura; Chan, Patrick P. K.; Biggio, Battista

2025-01-01

Abstract

To detect suspected poisoned data during the training phase, most backdoor defenses rely on a common assumption: that the features of poisoned and benign samples are separable. However, this assumption can be bypassed by novel adaptive attacks, which merge the features of poisoned and benign samples. In this paper, we counter these adaptive attacks and propose the Local-Feature-Powered Defense (LFPD), which leverages a local feature algorithm to measure the similarity of samples in the image space and uses it to guide the training process so as to increase the feature separability between poisoned and benign samples. LFPD then detects the outliers in the training dataset as poisoned samples and removes the backdoor by unlearning them. Finally, we compare LFPD with five existing defenses, and our experimental results demonstrate that LFPD outperforms them in defending against adaptive attacks.

Year: 2025

Keywords: Adaptive attack; Backdoor defence; local feature

Type: 4.1 Contribution in conference proceedings

Files in this product:

File	Size	Format
ICMLC-LFPD.pdf (open access; description: pre-print; type: pre-print version)	732.02 kB	Adobe PDF
LFPD_Local-Feature-Powered_Defense_Against_Adaptive_Backdoor_Attacks.pdf (archive managers only; type: publisher's version (VoR))	769.9 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/469665
