Benchmarking Large Language Models for Sustainable Development Goals Classification: Evaluating In-Context Learning and Fine-Tuning Strategies
De Leo V.; Fenu G.; Reforgiato Recupero D.; Salatino A.; Secchi L.
2025-01-01
Abstract
In 2015, the United Nations set 17 Sustainable Development Goals (SDGs) to build a better future by 2030, but monitoring progress toward them is challenging due to data complexity. Recent Large Language Models (LLMs) have significantly improved Natural Language Processing tasks, including text classification. This study evaluates only open-weight LLMs for single-label, multi-class SDG text classification, comparing Zero-Shot, Few-Shot, and Fine-Tuning approaches. Our goal is to determine whether smaller, resource-efficient models, optimized through prompt engineering, can obtain competitive results on a challenging dataset. Using a benchmark dataset from the Open SDG initiative, we find that, with effective prompt engineering, small models can achieve competitive performance.


