UNICA IRIS Institutional Research Information System

Raze to the ground: query-efficient adversarial HTML attacks on machine-learning phishing webpage detectors

Montaruli, Biagio;Demetrio, Luca;Pintor, Maura;Compagna, Luca;Balzarotti, Davide;Biggio, Battista

2023-01-01

Abstract

Machine-learning phishing webpage detectors (ML-PWD) have been shown to be vulnerable to adversarial manipulations of the HTML code of the input webpage. However, recently proposed attacks have shown limited effectiveness because they do not optimize how the adopted manipulations are used and they target only specific elements of the HTML code. In this work, we overcome these limitations by first designing a novel set of fine-grained manipulations that modify the HTML code of the input phishing webpage without compromising its maliciousness or visual appearance, i.e., the manipulations are functionality- and rendering-preserving by design. We then use a query-efficient black-box optimization algorithm to select which manipulations to apply to bypass the target detector. Our experiments show that our attacks raze to the ground the performance of current state-of-the-art ML-PWD using just 30 queries, thus outperforming the weaker attacks developed in previous work and enabling a much fairer robustness evaluation of ML-PWD.
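The two ingredients named in the abstract, rendering-preserving HTML manipulations and a black-box search under a fixed query budget, can be illustrated with a toy sketch. This is not the authors' algorithm or manipulation set: the detector `toy_score`, the two manipulations, and the plain random search are all hypothetical stand-ins chosen only to show the shape of a score-only robustness evaluation loop.

```python
import random
from typing import Callable, List, Tuple

Manipulation = Callable[[str], str]


def toy_score(html: str) -> float:
    """Hypothetical detector: fraction of links that point off-site."""
    external = html.count('href="http')
    internal = html.count('href="#')
    total = external + internal
    return external / total if total else 0.0


def add_hidden_internal_link(html: str) -> str:
    """Assumed manipulation: a hidden anchor is not rendered by the browser."""
    return html.replace("</body>", '<a href="#" hidden></a></body>', 1)


def add_comment(html: str) -> str:
    """Assumed manipulation: an HTML comment is never rendered."""
    return html.replace("</body>", "<!-- note --></body>", 1)


def random_search(
    html: str,
    score_fn: Callable[[str], float],
    manipulations: List[Manipulation],
    budget: int = 30,
    seed: int = 0,
) -> Tuple[str, float, int]:
    """Greedy random search that only observes the detector's output score."""
    rng = random.Random(seed)
    best_html, best_score = html, score_fn(html)
    queries = 1  # the initial evaluation counts against the budget
    while queries < budget:
        candidate = rng.choice(manipulations)(best_html)
        score = score_fn(candidate)
        queries += 1
        if score < best_score:  # keep a manipulation only if it lowers the score
            best_html, best_score = candidate, score
    return best_html, best_score, queries


if __name__ == "__main__":
    page = '<html><body><a href="http://example.com/login">Sign in</a></body></html>'
    _, final_score, used = random_search(
        page, toy_score, [add_hidden_internal_link, add_comment]
    )
    print(f"score {toy_score(page):.2f} -> {final_score:.2f} using {used} queries")
```

The search treats the detector as an opaque function and stops after `budget` calls, which is the sense in which an attack is "query-efficient"; a real evaluation would replace `toy_score` with the detector under test and use a stronger optimizer than random search.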

Year	2023
ISBN	979-8-4007-0260-0
Keywords	machine learning; phishing; adversarial attacks
Type	4.1 Contribution in conference proceedings

Files in this item:

File	Access	Type	Size	Format
aisec127-montaruli.pdf	Archive managers only	Publisher's version (VoR)	798.12 kB	Adobe PDF
Raze_to_the_Ground__Query_Efficient_Adversarial_HTML_Attacks_on_Machine_Learning_Phishing_Webpage_Detectors.pdf	Open access	Post-print (AAM)	997.07 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/393063
