The LE-2111 SPARKLE (Shallow Parsing and Knowledge extraction for Language Engineering) project is aimed at the automatic extraction of lexical and semantic information from textual corpora in order to improve the performances of NLP systems. In this paper we describe an algorithm for the extraction of subcategorization patterns for Italian verbs. The extraction procedure is carried out on the basis of an efficient and accurate analogy-based engine and pre- and post-filters based on simple linguistic constraints. Despite the simplicity of the analogy-based algorithm the amount of lost information is negligible, and precision and recall over a set of hand-crafted subcategorization patterns (namely those produced within the LE PAROLE project) is fairly high.

An efficient algorithm for the authomatic building of a lexicon from textual corpora

FEDERICI, STEFANO
1998-01-01

Abstract

The LE-2111 SPARKLE (Shallow Parsing and Knowledge extraction for Language Engineering) project is aimed at the automatic extraction of lexical and semantic information from textual corpora in order to improve the performances of NLP systems. In this paper we describe an algorithm for the extraction of subcategorization patterns for Italian verbs. The extraction procedure is carried out on the basis of an efficient and accurate analogy-based engine and pre- and post-filters based on simple linguistic constraints. Despite the simplicity of the analogy-based algorithm the amount of lost information is negligible, and precision and recall over a set of hand-crafted subcategorization patterns (namely those produced within the LE PAROLE project) is fairly high.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/36598
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact