Demographic Fairness in Multimodal Biometrics: A Comparative Analysis on Audio-Visual Speaker Recognition Systems

Fenu G.;Marras M.
2022-01-01

Abstract

In urban scenarios, biometric recognition technologies are increasingly being adopted to give citizens secure and usable access to personalized services. Given the challenging environmental conditions, combining evidence from multiple biometrics at a certain step of the recognition pipeline has often been shown to increase the performance of the biometric-enabled recognition system. Despite the accuracy gains achieved so far, it remains under-explored how the adopted biometric fusion policy affects the quality of the decisions made by the biometric system, depending on the demographic characteristics of the citizen under consideration. In this paper, we investigate the extent to which state-of-the-art multimodal recognition systems based on facial and vocal biometrics are susceptible to unfairness towards legally protected groups of individuals who share a common sensitive attribute. Specifically, we present a comparative analysis of the performance across groups for two deep learning architectures tailored for facial and vocal recognition, under seven fusion policies that cover different pipeline steps (feature, model, score, and decision). Experiments show that, compared to the unimodal systems alone and the other fusion policies, the multimodal system obtained via fusion at the model step leads to the highest overall accuracy and the lowest disparity across groups.
Bias; Biometric Authentication; Biometrics; Discrimination; Face Recognition; Fairness; Multimodal; Voice Recognition
Files in this item:
1-s2.0-S1877050921024753-main.pdf

open access

Type: publisher's version
Size 474.31 kB
Format Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/352364
Citations
  • Scopus 7