An Evasion Resilient Approach to the Detection of Malicious PDF Files

MAIORCA, DAVIDE; ARIU, DAVIDE; CORONA, IGINO; GIACINTO, GIORGIO
2015-01-01

Abstract

Malicious PDF files still constitute a serious threat to system security. New reader vulnerabilities have been discovered, and research has shown that current state-of-the-art approaches can be easily bypassed by exploiting weaknesses caused by erroneous parsing or incomplete information extraction. In this work, we present a novel machine learning system for the detection of malicious PDF files. We have developed a static approach that leverages information extracted from both the structure and the content of PDF files, which improves the system's robustness against evasion attacks. Experimental results show that our system is able to outperform all publicly available state-of-the-art tools. We also report a significant improvement in performance at detecting reverse mimicry attacks, which are able to completely evade systems that only extract information from the PDF file structure. Finally, we claim that, to avoid targeted attacks, a more careful design of machine-learning-based detectors is needed.
2015
978-3-319-27667-0
PDF, Evasion, Malware, JavaScript, Machine Learning

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/133145