UNICA IRIS Institutional Research Information System

Typosquatting consists of registering Internet domain names that closely resemble legitimate, reputable, and well-known ones (e.g., Farebook instead of Facebook). This cyber-attack aims to distribute malware or to phish the victims users (i.e., stealing their credentials) by mimicking the aspect of the legitimate webpage of the targeted organisation. The majority of the detection approaches proposed so far generate possible typo-variants of a legitimate domain, creating thus blacklists which can be used to prevent users from accessing typo-squatted domains. Only few studies have addressed the problem of Typosquatting detection by leveraging a passive Domain Name System (DNS) traffic analysis. In this work, we follow this approach, and additionally exploit machine learning to learn a similarity measure between domain names capable of detecting typo-squatted ones from the analyzed DNS traffic. We validate our approach on a large-scale dataset consisting of 4 months of traffic collected from a major Italian Internet Service Provider.

Deepsquatting: Learning-based typosquatting detection at deeper domain levels

Piredda, Paolo^Primo;Ariu, Davide;Biggio, Battista;Corona, Igino;Piras, Luca;Giacinto, Giorgio;Roli, Fabio^Ultimo

2017-01-01

Abstract

Typosquatting consists of registering Internet domain names that closely resemble legitimate, reputable, and well-known ones (e.g., Farebook instead of Facebook). This cyber-attack aims to distribute malware or to phish the victims users (i.e., stealing their credentials) by mimicking the aspect of the legitimate webpage of the targeted organisation. The majority of the detection approaches proposed so far generate possible typo-variants of a legitimate domain, creating thus blacklists which can be used to prevent users from accessing typo-squatted domains. Only few studies have addressed the problem of Typosquatting detection by leveraging a passive Domain Name System (DNS) traffic analysis. In this work, we follow this approach, and additionally exploit machine learning to learn a similarity measure between domain names capable of detecting typo-squatted ones from the analyzed DNS traffic. We validate our approach on a large-scale dataset consisting of 4 months of traffic collected from a major Italian Internet Service Provider.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2017
			
	Codice ISBN
	
				9783319701684
			
	Parole chiave
	
				Theoretical Computer Science; Computer Science (all)
			
	Tipologia:
	
				4.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
piredda17-AIIA.pdf Solo gestori archivio Tipologia: versione pre-print Dimensione 1.23 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	1.23 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/234540

Citazioni

ND

10

7

social impact