SailGenie: SAiling expertIse to knowLedge Graph through opEN Information Extraction
Carta, Salvatore; Giuliani, Alessandro; Piano, Leonardo; Podda, Alessandro Sebastian; Tiddia, Sandro Gabriele
2023-01-01
Abstract
This work focuses on the sailing domain, in which several innovative technologies are being adopted to improve sailing efficiency, performance, and safety. In this context, a knowledge graph could be used, for example, to represent information about different types of boats, sailing techniques, maritime safety, or weather conditions. Although numerous construction methods and ready-to-use knowledge graphs have been proposed in many fields, the sailing domain remains largely unexplored. Because the most effective methods rely on domain-specific datasets, the lack of suitable, publicly available sailing datasets is one of the main challenges. Several Open Information Extraction (OpenIE) methods can generate relevant triplets (the elementary units composing a knowledge graph) from arbitrary text without any additional information about its topic, but such methods usually also generate many incorrect triplets. In this paper, we aim (i) to address this problem by proposing an innovative method that combines different OpenIE tools in an improved and strengthened way to generate proper triplets from domain-specific sources and, in particular, (ii) to build and release a suitable dataset for the sailing domain. Results confirm that our proposal can maximize the extracted information and infer unique information that the classical OpenIE tools cannot retrieve and, furthermore, that the generated dataset is significantly valuable for the sailing scenario.
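As an illustration of the terms used in the abstract, the following minimal sketch shows what a triplet looks like and one simple way the outputs of several OpenIE tools could be pooled. The abstract does not describe how SailGenie actually combines its tools, so everything here is an assumption made for illustration: the `Triplet` type, the `normalize` and `merge_triplets` helpers, the agreement threshold, and the hand-written example triplets are all hypothetical, and no real OpenIE library is called.

```python
from typing import Iterable, NamedTuple


class Triplet(NamedTuple):
    """A (subject, relation, object) statement, the basic unit of a knowledge graph."""
    subject: str
    relation: str
    obj: str


def normalize(t: Triplet) -> Triplet:
    # Lowercase and collapse whitespace so trivially different spellings compare equal.
    return Triplet(*(" ".join(part.lower().split()) for part in t))


def merge_triplets(*tool_outputs: Iterable[Triplet]) -> dict[Triplet, int]:
    """Map each normalized triplet to the number of tools that produced it."""
    votes: dict[Triplet, int] = {}
    for output in tool_outputs:
        # Building a set first means a tool votes at most once per triplet.
        for t in {normalize(t) for t in output}:
            votes[t] = votes.get(t, 0) + 1
    return votes


# Hand-written stand-ins for the outputs of two hypothetical OpenIE tools.
tool_a = [Triplet("A sloop", "has", "one mast"), Triplet("The jib", "is", "a headsail")]
tool_b = [Triplet("a sloop", "has", "one  mast"), Triplet("A ketch", "has", "two masts")]

votes = merge_triplets(tool_a, tool_b)
# Assumed filter: keep only triplets on which at least two tools agree.
agreed = sorted(t for t, n in votes.items() if n >= 2)
print(agreed)
```

Pooling the tools' outputs keeps triplets that only one tool finds, while the vote count gives a crude signal for filtering out likely errors; the threshold of two is an arbitrary choice for this sketch.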
File: 1-s2.0-S1877050923013716-main.pdf | Access: open access | Type: publisher's version (VoR) | Size: 642.7 kB | Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.