Unsupervised Singleton Expansion from Free Text

Maurizio Atzori
2018-01-01

Abstract

Set expansion is the problem of automatically extending a given set of seed elements with objects of the same class, for instance {red, green, white} → {red, green, white, gray, yellow, ...}. In this paper we address the problem in the challenging scenario of extending singletons, that is, sets with only one seed element. Unlike existing work, we assume neither the presence of markup, such as HTML lists, nor any ontology, relying only on free (unstructured, plain) text. Despite the difficulty of the problem, we show that singleton expansion can be accomplished in an unsupervised manner by means of nearest neighbor search (NNS) over word embeddings. We further propose an algorithm that significantly improves the performance of NNS for both small and large (long tail) expansions, while retaining the important property of being language independent.
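
As an illustration of the NNS baseline mentioned in the abstract, the following minimal sketch ranks candidate words by cosine similarity to a single seed. It is not the algorithm proposed in the paper: the function names (`cosine`, `expand_singleton`) and the three-dimensional vectors in `TOY_EMBEDDINGS` are hypothetical, invented here for illustration. In practice the vectors would come from embeddings trained on free text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_singleton(seed, embeddings, k=3):
    """Return the k words whose vectors are closest to the seed's vector.

    This is plain nearest neighbor search: every other word is scored
    against the seed and the top k are kept, best first.
    """
    seed_vec = embeddings[seed]
    scored = [
        (cosine(seed_vec, vec), word)
        for word, vec in embeddings.items()
        if word != seed
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:k]]


# Hypothetical toy vectors (assumption): colors cluster on the first axis,
# animals on the second. Real embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "red": [0.9, 0.1, 0.0],
    "green": [0.8, 0.2, 0.1],
    "yellow": [0.85, 0.15, 0.05],
    "dog": [0.1, 0.9, 0.2],
    "cat": [0.0, 0.8, 0.3],
}

if __name__ == "__main__":
    print(expand_singleton("red", TOY_EMBEDDINGS, k=2))
```

With these toy vectors, the seed "red" is expanded with the other two color words before any animal word, which is the behavior the singleton setting relies on: a single seed vector has to carry the whole notion of the class.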
Year: 2018
ISBN: 978-153864407-2
Files in this record:
icsc18_singleton_expansion.pdf
Access: archive managers only
Type: publisher's version (VoR)
Size: 183.21 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/238981