Evaluating Fine-Tuned LLMs for AI Text Detection

Reforgiato Recupero D.
2025-01-01

Abstract

Large Language Models (LLMs) that generate human-like text amplify threats such as disinformation, which makes robust detection systems essential. Such systems need to be effective across diverse domains and resilient to adversarial manipulations like paraphrasing. We investigated fine-tuned LLMs as detectors of AI-generated text, applying supervised fine-tuning (SFT) with varying degrees of domain coverage in the training data. The resulting detectors were evaluated on their ability to generalize to entirely unseen data sources (domains and generator models) and on their robustness to adversarial manipulations. The findings indicate that the approach is promising: with multi-domain training, it achieves performance comparable or superior to methods in the current literature.
2025
9783032006417
9783032006424
Adversarial Manipulation
AI-Generated Text Detection
Domain Generalization
Large Language Models
Robustness
Files in this item:
File Size Format
A_novel_approach_for_AI_detection_content_based_on_fine_tuned_LLMs__LCNS___Workshop_ (1).pdf

embargoed until 10/08/2026

Type: post-print version (AAM)
Size 357.56 kB
Format Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/480229