UNICA IRIS Institutional Research Information System

In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers.

A comparative analysis of knowledge injection strategies for large language models in the scholarly domain

Cadeddu A.;Chessa A.;De Leo V.;Fenu G.;Motta E.;Osborne F.;Reforgiato Recupero D.;Salatino A.;Secchi L.

2024-01-01

Abstract

In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2024
			
	Parole chiave
	
				BERT; Classification; Knowledge graphs; Knowledge injection; Large language models; Natural language processing; Transformers
			
	Tipologia:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
A comparative analysis of knowledge injection strategies for large language models in the scholarly domain - 1-s2.0-S0952197624003245-main.pdf accesso aperto Tipologia: versione editoriale (VoR) Dimensione 2.16 MB Formato Adobe PDF Visualizza/Apri	2.16 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/426189

Citazioni

ND

19

10

social impact