UNICA IRIS Institutional Research Information System

Causal reasoning for algorithmic fairness in voice controlled cyber-physical systems

Fenu G.;Marras M.;Medda G.;Meloni G.

2023-01-01

Abstract

Automated speaker recognition enables personalized interactions with the voice-based interfaces and assistants that are part of modern cyber-physical-social systems. Prior studies have uncovered disparate impacts across demographic groups in the outcomes of speaker recognition systems and have consequently proposed a range of countermeasures. Understanding why a speaker recognition system may perform differently for different individuals or groups, beyond mere data imbalance and black-box countermeasures, is an essential yet under-explored perspective. In this paper, we propose an explanatory framework that aims to provide a better understanding of how speaker recognition models perform as the underlying voice characteristics on which they are tested change. With our framework, we evaluate two state-of-the-art speaker recognition models and compare their fairness in terms of security through a systematic analysis of the impact of more than twenty voice characteristics. Our findings include important takeaways for making voice-controlled cyber-physical-social systems work for everyone. Source code and data are available at https://bit.ly/EA-PRLETTERS.
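As a rough illustration of what comparing "fairness in terms of security" across a voice characteristic could look like, the sketch below computes the false acceptance rate (the share of impostor trials wrongly accepted) separately for each group at one shared decision threshold. It is not the authors' framework: the function name, the group labels, the toy scores, and the threshold are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def false_acceptance_rate_by_group(trials, threshold):
    """Return the false acceptance rate per group.

    trials: iterable of (score, is_same_speaker, group) tuples, where
    score is a similarity score and group is a hypothetical label for
    a voice characteristic (for example, a pitch range).
    """
    accepted = defaultdict(int)
    total = defaultdict(int)
    for score, is_same_speaker, group in trials:
        if is_same_speaker:
            continue  # only impostor trials count towards false acceptance
        total[group] += 1
        if score >= threshold:
            accepted[group] += 1
    return {group: accepted[group] / total[group] for group in total}


# Hypothetical toy trials, not data from the paper.
trials = [
    (0.91, True, "low_pitch"),
    (0.72, False, "low_pitch"),
    (0.35, False, "low_pitch"),
    (0.88, True, "high_pitch"),
    (0.40, False, "high_pitch"),
    (0.30, False, "high_pitch"),
]

far = false_acceptance_rate_by_group(trials, threshold=0.6)
# Gap between the most and least exposed group at this threshold.
gap = max(far.values()) - min(far.values())
print(far, gap)
```

A larger gap means that, at the same threshold, impostors are accepted more often for one group than for another, which is one simple way to read a security disparity.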

Year of publication: 2023

Keywords: Authentication; Fairness; Security; Speaker recognition; Voice biometrics

Type: 1.1 Journal article

Files in this record:

File	Size	Format
1-s2.0-S016786552300079X-main.pdf (open access; type: publisher's version, VoR)	1.23 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/432656
