A transformer-based modelling approach for robust QoE estimation in video streaming
Hamidi, Mohammad Ali;Porcu, Simone;Floris, Alessandro;Atzori, Luigi
2025-01-01
Abstract
Network management is crucial to ensuring adequate Quality of Experience (QoE) for all connected users. In recent years, artificial intelligence (AI) has been leveraged to support the development of accurate QoE prediction models. In this paper, we exploit the learning characteristics of transformer architectures to implement a novel transformer-based model for estimating the QoE of video streaming services, the most consumed media services on the Internet. Dedicated steps have been defined to collect, encode, and sequentialize the data concerning streaming session-related key performance indicators (KPIs). These steps are needed to prepare the data in a sequential form appropriate for the transformer's learning processes. The model has been trained using two open datasets from the ITU-T P.1203 standardization procedure, including 82 videos (watched and rated on mobile and PC devices) impaired by video quality switching, delay, and stalling events. The proposed model outperforms the P.1203 model in predicting the QoE rated on both mobile and PC devices in terms of root mean square error (RMSE), Pearson correlation coefficient (PCC), and Spearman correlation coefficient (SCC). Moreover, our model achieved robust performance in the cross-device scenario, i.e., it accurately estimates the QoE of a video watched on a device different from that used for training the model.
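The three evaluation metrics named in the abstract (RMSE, PCC, and SCC) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the sample scores below are hypothetical and serve only to show how predicted QoE values might be compared against subjective ratings.

```python
import math

def rmse(pred, true):
    """Root mean square error between predicted and rated scores."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def pearson(x, y):
    """Pearson correlation coefficient (PCC): linear agreement."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation coefficient (SCC): PCC computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical scores on a 1-5 scale, for illustration only (not from the paper).
predicted = [4.1, 3.2, 2.5, 4.6, 1.8]
rated = [4.0, 3.5, 2.0, 4.8, 2.1]

print("RMSE:", rmse(predicted, rated))
print("PCC: ", pearson(predicted, rated))
print("SCC: ", spearman(predicted, rated))
```

A lower RMSE indicates smaller prediction error, while PCC and SCC closer to 1 indicate stronger linear and monotonic agreement, respectively, between predicted and rated QoE.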