SailGenie: SAiling expertIse to knowLedge Graph through opEN Information Extraction

Carta, Salvatore;Giuliani, Alessandro;Piano, Leonardo;Podda, Alessandro Sebastian;Tiddia, Sandro Gabriele
2023-01-01

Abstract

This work focuses on the sailing domain, in which several innovative technologies are being adopted to improve sailing efficiency, performance, and safety. In this context, a knowledge graph could be used, for example, to represent information about different types of boats, sailing techniques, maritime safety, or weather conditions. Although numerous construction methods and ready-to-use knowledge graphs have been proposed in many fields, the sailing domain remains largely unexplored. Because the most effective methods rely on domain-specific datasets, the absence of suitable, available sailing datasets is one of the main challenges. Open Information Extraction (OpenIE) methods can generate relevant triplets (the elementary units composing a knowledge graph) from arbitrary text without any additional information about its topic, but they usually also generate many incorrect triplets. In this paper, we aim (i) to address this problem by proposing an innovative method that combines different OpenIE tools in an improved and strengthened way to generate proper triplets from domain-specific sources and, in particular, (ii) to build and release a suitable dataset for the sailing domain. Results confirm that our proposal can maximize the extracted information and infer unique information that the classical OpenIE tools cannot retrieve and, furthermore, that the generated dataset is valuable for the sailing scenario.
Keywords: Artificial Intelligence; Knowledge graphs; Sailing; Domain-specific dataset

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/433807