Android malware detectors are now widely implemented with machine learning algorithms, trained on large datasets of goodware and malware applications gathered at a fixed moment in time. However, as recent work has shown, this domain is not stationary, so detector performance degrades over time. While recent work pinpoints the presence of such drift, little has been done to isolate its causes and understand the underlying reasons. In this work, we show which features cause the data drift, i.e., which new features appear and which old ones become unreliable. Our experimental evaluation highlights that particular feature groups cause the data drift. However, we also show that removing these highly variable features from the feature set is insufficient to achieve good classification performance.
Data drift in Android malware detection
Minnei, Luca;Eddoubi, Hicham;Sotgiu, Angelo;Pintor, Maura;Demontis, Ambra;Biggio, Battista
2025-01-01
| File | Access | Type | Size | Format |
|---|---|---|---|---|
| Data_Drift_in_Android_Malware_Detection.pdf | Archive managers only | Publisher's version (VoR) | 366.5 kB | Adobe PDF |
| ICMLC-drift-malware.pdf | Open access | Pre-print version | 306.46 kB | Adobe PDF |


