Bias Classification and Interpretation in Web Content Via Language Models for Trustworthy Non-personalized Information Discovery

Bartoletti M.; Boratto L.; Reforgiato Recupero D.; Rodriguez A.; Scarpi G.
2026-01-01

Abstract

Bias in Web content influences public perception, fostering filter bubbles that limit exposure to diverse viewpoints and contribute to societal polarization. While eliminating such bias entirely is unrealistic, making it more transparent can empower readers to make informed decisions. Existing bias detection methods struggle to accurately identify specific bias types, and little is known about the extent to which content characteristics, such as text length and semantics, affect their effectiveness. In this paper, we explore the use of language models for fine-grained bias classification, focusing on three key objectives: estimating detection accuracy limits for each bias type, investigating the effect of content length on model performance, and exploring the interpretability of model predictions. Our results show that this class of models detects various forms of bias well, with content length often having minimal impact. Moreover, model predictions can often be interpreted, providing insight into the factors that drive bias classification. These findings are applied within the InDi search engine, an open-source project within the NGI Search initiative funded by the European Union, where bias detection models help re-rank search results to promote a more trustworthy information ecosystem. Source code for the bias detection experiments is available at https://github.com/ngi-indi/module-bias-gym.
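To make the described pipeline concrete, the sketch below is a rough illustration only, not the authors' implementation: the checkpoint name, the two-label layout (index 1 = "biased"), the word-occlusion explanation heuristic, and the alpha blending weight are all illustrative assumptions. It shows how a fine-tuned sequence classifier could score result snippets for bias, explain a prediction, and re-rank search results by blending relevance with the bias score.

# Minimal sketch, not the authors' pipeline: score snippets with a sequence
# classifier, explain a prediction by word occlusion, and re-rank results.
# Checkpoint name, label layout (index 1 = "biased"), and the alpha blending
# weight are illustrative assumptions, not details taken from the paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased"  # placeholder; a bias-tuned checkpoint belongs here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()  # note: the classification head is untrained until fine-tuned

def bias_probability(text: str) -> float:
    """Probability that `text` is biased, assuming label index 1 = biased."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def word_importance(text: str) -> list[tuple[str, float]]:
    """Occlusion-based explanation: drop each word and record how much the
    predicted bias probability falls; high scores mark bias-driving words."""
    base = bias_probability(text)
    words = text.split()
    deltas = [
        (w, base - bias_probability(" ".join(words[:i] + words[i + 1:])))
        for i, w in enumerate(words)
    ]
    return sorted(deltas, key=lambda t: t[1], reverse=True)

def rerank(results: list[dict], alpha: float = 0.5) -> list[dict]:
    """Blend the engine's relevance with (1 - bias) and sort descending."""
    for r in results:
        r["bias"] = bias_probability(r["snippet"])
        r["score"] = alpha * r["relevance"] + (1 - alpha) * (1 - r["bias"])
    return sorted(results, key=lambda r: r["score"], reverse=True)

if __name__ == "__main__":
    hits = [
        {"snippet": "Experts report mixed evidence on the policy's effects.", "relevance": 0.90},
        {"snippet": "Only a fool would back this disastrous, shameful policy.", "relevance": 0.95},
    ]
    for r in rerank(hits):
        print(f"score={r['score']:.3f} bias={r['bias']:.3f} {r['snippet']}")

With the untrained head above, the probabilities are effectively random and serve only to exercise the code path; a checkpoint fine-tuned on bias-labeled data would replace the base model in practice.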
2026
ISBN: 9783032057471
ISBN: 9783032057488
AI for social good
Explainable AI
Information retrieval
Language models
Media bias detection
Transparency
Trust
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11584/480325
Warning: the displayed data have not been validated by the university.
