Evaluating Fine-Tuned LLMs for AI Text Detection
Reforgiato Recupero D.
2025-01-01
Abstract
To counter threats such as disinformation, which Large Language Models (LLMs) amplify by generating human-like text, robust detection systems are essential. Such systems must be effective across diverse domains and resilient to adversarial manipulations such as paraphrasing. We investigated fine-tuned LLMs for AI-generated text detection, applying supervised fine-tuning (SFT) with varying domain coverage during training. We tested the resulting detectors on their generalization to entirely unseen data sources (domains and generator models) and on their robustness against adversarial manipulations. The findings indicate that the approach is promising: with multi-domain training, it achieves performance comparable or superior to methods in the current literature.

| File | Availability | Type | Size | Format |
|---|---|---|---|---|
| A_novel_approach_for_AI_detection_content_based_on_fine_tuned_LLMs__LCNS___Workshop_ (1).pdf | Embargoed until 10/08/2026 | Post-print version (AAM) | 357.56 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


