AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples
Cinà, Antonio Emanuele;Pintor, Maura;Demetrio, Luca;Demontis, Ambra;Biggio, Battista;Roli, Fabio
2025-01-01
Abstract
While novel gradient-based attacks are continuously proposed to improve the optimization of adversarial examples, each is shown to outperform its predecessors using different experimental setups, implementations, and computational budgets, leading to biased and unfair comparisons. In this work, we overcome this issue by proposing AttackBench, an evaluation framework that assesses the effectiveness of each attack (along with its different library implementations) under the same maximum computational budget. To this end, we (i) define a novel optimality metric that quantifies how close each attack is to the optimal solution (empirically estimated by ensembling all attacks), and (ii) limit the maximum number of forward and backward queries that each attack can execute on the target model. Our extensive experimental analysis compares more than 100 attack implementations over 800 different configurations, considering both CIFAR-10 and ImageNet models, and shows that only a few attack implementations outperform all remaining approaches. These findings suggest that novel defenses should be evaluated against attacks other than those normally used in the literature to avoid overly optimistic robustness evaluations. We release AttackBench as a publicly available benchmark, including a continuously updated leaderboard and source code, to maintain an up-to-date ranking of the best gradient-based attacks.
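The optimality metric and the query-budget accounting described in the abstract can be illustrated with a short sketch. The code below is an assumption for illustration only: the function names, signatures, and the exact formula are not taken from the paper. It scores an attack by comparing the area under its robustness curve (robust accuracy versus perturbation budget) against the curve of the empirical optimum obtained by taking the per-sample minimum perturbation over all attacks, and it wraps the target model to enforce a fixed forward-query budget.

```python
import numpy as np

def optimality(attack_norms, ensemble_norms, max_eps, steps=100):
    """Hypothetical sketch of an AttackBench-style optimality score.

    attack_norms: per-sample minimal perturbation norms found by one attack.
    ensemble_norms: per-sample minima over all attacks (empirical optimum).
    Returns a value in [0, 1]; 1.0 means the attack matches the ensemble
    optimum. The exact formula is an assumption, not the paper's.
    """
    attack_norms = np.asarray(attack_norms)
    ensemble_norms = np.asarray(ensemble_norms)
    eps = np.linspace(0.0, max_eps, steps)
    # Robust accuracy at each budget: fraction of samples whose minimal
    # adversarial perturbation is still larger than that budget.
    robust_attack = np.array([(attack_norms > e).mean() for e in eps])
    robust_best = np.array([(ensemble_norms > e).mean() for e in eps])
    auc_attack = np.trapz(robust_attack, eps)
    auc_best = np.trapz(robust_best, eps)
    return 1.0 - (auc_attack - auc_best) / max(auc_attack, 1e-12)


class QueryBudgetedModel:
    """Hypothetical wrapper that stops an attack once its forward-query
    budget is exhausted; backward queries could be counted analogously
    (e.g., via autograd hooks)."""

    def __init__(self, model, max_forwards):
        self.model = model
        self.max_forwards = max_forwards
        self.forwards = 0

    def __call__(self, x):
        if self.forwards >= self.max_forwards:
            raise RuntimeError("forward-query budget exhausted")
        self.forwards += 1
        return self.model(x)
```

Under these assumptions, an AttackBench-style comparison would amount to running every attack against such a budgeted wrapper, recording the per-sample minimal perturbation norms it finds, and ranking attacks by their optimality score.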
| File | Size | Format |
|---|---|---|
| 32263-Article Text-36331-1-2-20250410-2.pdf (open access; type: published version, VoR) | 528.84 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.