Facial and Speech-based Signal Processing Systems for Quality of Experience and Emotion Estimation of Multimedia Applications

Porcu, Simone; Floris, Alessandro
2025-01-01

Abstract

Assessing the Quality of Experience (QoE) of multimedia services is crucial to ensure end users are satisfied. However, traditional feedback collection may suffer from bias due to the rating scale and may neglect the dynamic subjective nature of human perception, influenced by the user’s emotional state. This paper proposes an alternative solution to unobtrusively estimate QoE relying on features related to the user’s facial expressions and speech characteristics, which naturally reflect the user’s emotional state and enable QoE estimation without explicit feedback. The presented solution includes several steps, from the data collection process to the extraction and selection of the facial and speech features used to train the resulting QoE estimation models. Both single-modal and multi-modal learning approaches based on data fusion have been considered. These models can support the management of network and application resources of traditional and immersive multimedia services.
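The abstract mentions both single-modal and multi-modal learning based on data fusion. As a minimal illustrative sketch (not the authors' actual pipeline), feature-level fusion can be expressed as concatenating per-sample facial and speech feature vectors before training a model; the feature dimensions, synthetic data, and the nearest-centroid classifier below are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-modal features (dimensions are assumptions):
# facial features (e.g., action-unit intensities) and speech features
# (e.g., pitch/energy statistics) for N rated multimedia sessions.
n_samples, n_face, n_speech = 60, 8, 6
X_face = rng.normal(size=(n_samples, n_face))
X_speech = rng.normal(size=(n_samples, n_speech))
# QoE labels, reduced here to two classes (e.g., acceptable / not) for brevity.
y = rng.integers(0, 2, size=n_samples)

def early_fusion(face, speech):
    """Feature-level (early) fusion: concatenate modality vectors per sample."""
    return np.concatenate([face, speech], axis=1)

def nearest_centroid_fit(X, y):
    """Fit a trivial nearest-centroid model, a stand-in for the QoE estimator."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(X, classes, centroids):
    """Assign each sample the class of its closest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

X_fused = early_fusion(X_face, X_speech)
classes, centroids = nearest_centroid_fit(X_fused, y)
pred = nearest_centroid_predict(X_fused, classes, centroids)
print("fused feature dim:", X_fused.shape[1])
print("training accuracy: %.2f" % (pred == y).mean())
```

The same skeleton covers the single-modal baselines by training on `X_face` or `X_speech` alone; a decision-level (late) fusion variant would instead combine the per-modality predictions.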
Keywords: Quality of Experience; Facial Emotion Recognition; Speech Emotion Recognition; Multimedia

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/478366
