UNICA IRIS Institutional Research Information System

Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of infection vectors such as documents. These infection vectors hide embedded malicious code to the victim users, facilitating the use of social engineering techniques to infect their machines. Research showed that machine-learning algorithms provide effective detection mechanisms against such threats, but the existence of an arms race in adversarial settings has recently challenged such systems. In this work, we focus on malware embedded in PDF files as a representative case of such an arms race. We start by providing a comprehensive taxonomy of the different approaches used to generate PDF malware and of the corresponding learning-based detection systems. We then categorize threats specifically targeted against learning-based PDF malware detectors using a well-established framework in the field of adversarial machine learning. This framework allows us to categorize known vulnerabilities of learning-based PDF malware detectors and to identify novel attacks that may threaten such systems, along with the potential defense mechanisms that can mitigate the impact of such threats. We conclude the article by discussing how such findings highlight promising research directions towards tackling the more general challenge of designing robust malware detectors in adversarial settings.

Towards adversarial malware detection: lessons learned from PDF-based attacks

Maiorca D.^Primo;Biggio B.^Secondo;Giacinto G.^Ultimo

2020-01-01

Abstract

Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of infection vectors such as documents. These infection vectors hide embedded malicious code to the victim users, facilitating the use of social engineering techniques to infect their machines. Research showed that machine-learning algorithms provide effective detection mechanisms against such threats, but the existence of an arms race in adversarial settings has recently challenged such systems. In this work, we focus on malware embedded in PDF files as a representative case of such an arms race. We start by providing a comprehensive taxonomy of the different approaches used to generate PDF malware and of the corresponding learning-based detection systems. We then categorize threats specifically targeted against learning-based PDF malware detectors using a well-established framework in the field of adversarial machine learning. This framework allows us to categorize known vulnerabilities of learning-based PDF malware detectors and to identify novel attacks that may threaten such systems, along with the potential defense mechanisms that can mitigate the impact of such threats. We conclude the article by discussing how such findings highlight promising research directions towards tackling the more general challenge of designing robust malware detectors in adversarial settings.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2020
			
	Parole chiave
	
				Evasion attacks; Infection vectors; Javascript; Machine learning; PDF files; Vulnerabilities
			
	Tipologia:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
a78-maiorca.pdf Solo gestori archivio Descrizione: articolo Tipologia: versione editoriale (VoR) Dimensione 1.27 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	1.27 MB	Adobe PDF	Visualizza/Apri Richiedi una copia
CSUR_IRIS.pdf accesso aperto Tipologia: versione post-print (AAM) Dimensione 952.96 kB Formato Adobe PDF Visualizza/Apri	952.96 kB	Adobe PDF	Visualizza/Apri

I metadati presenti in IRIS UNICA sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono protetti da diritto d'autore, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/277858

Citazioni

ND

97

67

ND

social impact