Bias Classification and Interpretation in Web Content Via Language Models for Trustworthy Non-personalized Information Discovery

Bartoletti M.; Boratto L.; Reforgiato Recupero D.; Rodriguez A.; Scarpi G.
2026-01-01

Abstract

Bias in Web content influences public perception, fostering filter bubbles that limit exposure to diverse viewpoints and contribute to societal polarization. While eliminating such bias entirely is unrealistic, making it more transparent can empower readers to make informed decisions. Existing bias detection methods struggle to accurately identify specific bias types, and little is known about the extent to which content characteristics, such as text length and semantics, affect their effectiveness. In this paper, we explore the use of language models for fine-grained bias classification, focusing on three key objectives: estimating detection accuracy limits for each bias type, investigating the effect of content length on model performance, and exploring the interpretability of model predictions. Our results show that this class of models detects various forms of bias well, with content length often having minimal impact. Moreover, model predictions can often be interpreted, providing insight into the factors that drive bias classification. These findings are applied within the InDi search engine, an open-source project within the NGI Search initiative funded by the European Union, where bias detection models help re-rank search results to promote a more trustworthy information ecosystem. Source code for the bias detection experiments is available at https://github.com/ngi-indi/module-bias-gym.
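To make the described pipeline concrete, the sketch below is a rough illustration only, not the authors' implementation: the checkpoint name, the two-label layout (index 1 = "biased"), the word-occlusion explanation heuristic, and the alpha blending weight are all illustrative assumptions. It shows how a fine-tuned sequence classifier could score result snippets for bias, explain a prediction, and re-rank search results by blending relevance with the bias score.

# Minimal sketch, not the authors' pipeline: score snippets with a sequence
# classifier, explain a prediction by word occlusion, and re-rank results.
# Checkpoint name, label layout (index 1 = "biased"), and the alpha blending
# weight are illustrative assumptions, not details taken from the paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased"  # placeholder; a bias-tuned checkpoint belongs here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()  # note: the classification head is untrained until fine-tuned

def bias_probability(text: str) -> float:
    """Probability that `text` is biased, assuming label index 1 = biased."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def word_importance(text: str) -> list[tuple[str, float]]:
    """Occlusion-based explanation: drop each word and record how much the
    predicted bias probability falls; high scores mark bias-driving words."""
    base = bias_probability(text)
    words = text.split()
    deltas = [
        (w, base - bias_probability(" ".join(words[:i] + words[i + 1:])))
        for i, w in enumerate(words)
    ]
    return sorted(deltas, key=lambda t: t[1], reverse=True)

def rerank(results: list[dict], alpha: float = 0.5) -> list[dict]:
    """Blend the engine's relevance with (1 - bias) and sort descending."""
    for r in results:
        r["bias"] = bias_probability(r["snippet"])
        r["score"] = alpha * r["relevance"] + (1 - alpha) * (1 - r["bias"])
    return sorted(results, key=lambda r: r["score"], reverse=True)

if __name__ == "__main__":
    hits = [
        {"snippet": "Experts report mixed evidence on the policy's effects.", "relevance": 0.90},
        {"snippet": "Only a fool would back this disastrous, shameful policy.", "relevance": 0.95},
    ]
    for r in rerank(hits):
        print(f"score={r['score']:.3f} bias={r['bias']:.3f} {r['snippet']}")

With the untrained head above, the probabilities are effectively random and serve only to exercise the code path; a checkpoint fine-tuned on bias-labeled data would replace the base model in practice.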
2026
ISBN: 9783032057471
ISBN: 9783032057488
AI for social good
Explainable AI
Information retrieval
Language models
Media bias detection
Transparency
Trust
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11584/480325
Warning: the displayed data have not been validated by the university.
