An Evasion Resilient Approach to the Detection of Malicious PDF Files
MAIORCA, DAVIDE; ARIU, DAVIDE; CORONA, IGINO; GIACINTO, GIORGIO
2015-01-01
Abstract
Malicious PDF files still constitute a serious threat to system security. New reader vulnerabilities have been discovered, and research has shown that current state-of-the-art approaches can be easily bypassed by exploiting weaknesses caused by erroneous parsing or incomplete information extraction. In this work, we present a novel machine learning system for the detection of malicious PDF files. We have developed a static approach that leverages information extracted from both the structure and the content of PDF files, which improves the system's robustness against evasion attacks. Experimental results show that our system outperforms all publicly available state-of-the-art tools. We also report a significant improvement in performance at detecting reverse mimicry attacks, which are able to completely evade systems that only extract information from the PDF file structure. Finally, we claim that, to avoid targeted attacks, a more careful design of machine-learning-based detectors is needed.

File | Access | Type | Size | Format
---|---|---|---|---
ICISSP_Chapter_Book_Printed_2015.pdf | Archive managers only | Publisher's version (VoR) | 503.11 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.