UNICA IRIS Institutional Research Information System

Dictionary Attacks on Speaker Verification

Marras M.;Korus P.;Jain A.;Memon N.

2023-01-01

Abstract

In this paper, we propose dictionary attacks against speaker verification: a novel attack vector that aims to match a large fraction of the speaker population by chance. We introduce a generic formulation of the attack that can be used with various speech representations and threat models. The attacker uses adversarial optimization to maximize the raw similarity of speaker embeddings between a seed speech sample and a proxy population. The resulting master voice successfully matches a non-trivial fraction of people in an unknown population. Adversarial waveforms obtained with our approach can match, on average, 69% of females and 38% of males enrolled in the target system at a strict decision threshold calibrated to yield a false alarm rate of 1%. By using the attack with a black-box voice cloning system, we obtain master voices that are effective in the most challenging conditions and transferable between speaker encoders. We also show that, when combined with multiple attempts, this attack raises even more serious concerns about the security of these systems.
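The optimization and evaluation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper optimizes speech inputs through a speaker encoder, whereas the sketch below skips the encoder and runs projected gradient ascent directly on synthetic embedding vectors. All function names (`make_population`, `optimize_master`, `far_threshold`, `impersonation_rate`), the synthetic data model, and the step size are assumptions made for illustration only.

```python
import math
import random


def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(unit(u), unit(v)))


def mean_similarity(x, population):
    """Average cosine similarity between x and every embedding in population."""
    return sum(cosine(x, p) for p in population) / len(population)


def make_population(rng, base, n, shared=0.5):
    """Synthetic stand-in for speaker embeddings (assumption): shared part plus noise."""
    return [[shared * b + rng.gauss(0, 1) for b in base] for _ in range(n)]


def optimize_master(seed, proxy, steps=100, lr=0.5):
    """Projected gradient ascent on mean cosine similarity to the proxy population."""
    x = unit(seed)
    proxy = [unit(p) for p in proxy]
    # Mean of the unit-length proxy embeddings; the objective is dot(m, x) for unit x.
    m = [sum(p[i] for p in proxy) / len(proxy) for i in range(len(x))]
    for _ in range(steps):
        along = sum(mi * xi for mi, xi in zip(m, x))
        grad = [mi - along * xi for mi, xi in zip(m, x)]  # gradient at unit-length x
        x = unit([xi + lr * gi for xi, gi in zip(x, grad)])  # step, then renormalize
    return x


def far_threshold(impostor_scores, far=0.01):
    """Threshold such that at most a `far` fraction of impostor scores exceed it."""
    ranked = sorted(impostor_scores, reverse=True)
    return ranked[int(far * len(ranked))]


def impersonation_rate(candidate, enrolled, threshold):
    """Fraction of enrolled speakers whose embedding the candidate matches."""
    return sum(cosine(candidate, e) > threshold for e in enrolled) / len(enrolled)


if __name__ == "__main__":
    rng = random.Random(0)
    base = [rng.gauss(0, 1) for _ in range(16)]
    proxy = make_population(rng, base, 200)   # population available to the attacker
    target = make_population(rng, base, 200)  # unseen enrolled population
    seed = [rng.gauss(0, 1) for _ in range(16)]
    impostor = [cosine(a, b) for i, a in enumerate(target) for b in target[i + 1:]]
    thr = far_threshold(impostor, far=0.01)
    master = optimize_master(seed, proxy)
    print("seed rate:  ", impersonation_rate(seed, target, thr))
    print("master rate:", impersonation_rate(master, target, thr))
```

The threshold is calibrated on impostor pairs from the target population, mirroring the 1% false alarm setting mentioned in the abstract, and the candidate is then scored against speakers it never saw during optimization. Because the data here is synthetic, the printed rates say nothing about the figures reported in the paper.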

Year of publication: 2023

Keywords: Dictionaries; Fingerprint recognition; Optimization; Perturbation methods; Psychoacoustic models; Sociology; Statistics

Type: 1.1 Journal article

Files in this item:

File	Size	Format	Access	Type
Dictionary_Attacks_on_Speaker_Verification.pdf	7.8 MB	Adobe PDF	Archive managers only	Publisher's version (VoR)
TIFS-Marras_public.pdf	5.38 MB	Adobe PDF	Open access	Post-print version (AAM)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/352363
