The ever-increasing use of services based on computer networks, even in crucial areas unthinkable until a few years ago, has made the security of these networks a crucial element for anyone, also in consideration of the increasingly sophisticated techniques and strategies available to attackers. In this context, Intrusion Detection Systems (IDSs) play a primary role since they are responsible for analyzing and classifying each network activity as legitimate or illegitimate, allowing us to take the necessary countermeasures at the appropriate time. However, these systems are not infallible due to several reasons, the most important of which are the constant evolution of the attacks (e.g., zero-day attacks) and the problem that many of the attacks have behavior similar to those of legitimate activities, and therefore they are very hard to identify. This work relies on the hypothesis that the subdivision of the training data used for the IDS classification model definition into a certain number of partitions, in terms of events and features, can improve the characterization of the network events, improving the system performance. The non-overlapping data partitions train independent classification models, classifying the event according to a majority-voting rule. A series of experiments conducted on a benchmark real-world dataset support the initial hypothesis, showing a performance improvement with respect to a canonical training approach.

Leveraging the Training Data Partitioning to Improve Events Characterization in Intrusion Detection Systems

Saia R.
;
Carta S.;Fenu G.;Pompianu L.
2023-01-01

Abstract

The ever-increasing use of services based on computer networks, even in crucial areas unthinkable until a few years ago, has made the security of these networks a crucial element for anyone, also in consideration of the increasingly sophisticated techniques and strategies available to attackers. In this context, Intrusion Detection Systems (IDSs) play a primary role since they are responsible for analyzing and classifying each network activity as legitimate or illegitimate, allowing us to take the necessary countermeasures at the appropriate time. However, these systems are not infallible due to several reasons, the most important of which are the constant evolution of the attacks (e.g., zero-day attacks) and the problem that many of the attacks have behavior similar to those of legitimate activities, and therefore they are very hard to identify. This work relies on the hypothesis that the subdivision of the training data used for the IDS classification model definition into a certain number of partitions, in terms of events and features, can improve the characterization of the network events, improving the system performance. The non-overlapping data partitions train independent classification models, classifying the event according to a majority-voting rule. A series of experiments conducted on a benchmark real-world dataset support the initial hypothesis, showing a performance improvement with respect to a canonical training approach.
2023
intrusion detection; network security; training data; algorithm
File in questo prodotto:
File Dimensione Formato  
JAIT-V14N6-1345.pdf

accesso aperto

Tipologia: versione editoriale (VoR)
Dimensione 1.67 MB
Formato Adobe PDF
1.67 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/385084
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact