Evaluating Fine-Tuned LLMs for AI Text Detection

Murgia, M.; Reforgiato Recupero, D.; Spathoulas, G.

doi:10.1007/978-3-032-00642-4_17

To counter threats like disinformation, which are amplified by Large Language Models (LLMs) generating human-like text, robust detection systems are essential. Such systems need to be effective across diverse domains and resilient to adversarial manipulations like paraphrasing. We investigated the use of fine-tuned LLMs for AI-generated text detection, evaluating supervised fine-tuning (SFT) with varying domain coverage during training. The resulting detectors were tested for their ability to generalize to entirely unseen data sources (domains and generator models) and for their robustness against adversarial manipulations. The findings indicate that this methodology is promising: with multi-domain training, it achieves performance comparable or superior to methods in the current literature.

2025-01-01

Year: 2025
ISBN: 9783032006417; 9783032006424
Keywords: Adversarial Manipulation; AI-Generated Text Detection; Domain Generalization; Large Language Models; Robustness
Type: 4.1 Contribution in conference proceedings

Files in this item:

File: A_novel_approach_for_AI_detection_content_based_on_fine_tuned_LLMs__LCNS___Workshop_ (1).pdf
Type: post-print version (AAM); Size: 357.56 kB; Format: Adobe PDF; Embargoed until 10/08/2026

The metadata in IRIS UNICA are released under the Creative Commons CC0 1.0 Universal license, while the publication files are protected by copyright unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/480229

UNICA IRIS Institutional Research Information System