Analogy-based extraction of lexical knowledge from corpora: the SPARKLE experience
FEDERICI, STEFANO
1998-01-01
Abstract
This paper reports on the experience of developing and applying a shallow parsing scheme (chunking) to unrestricted Italian texts, with a view to automatic acquisition of lexical information from corpora, and the prospective definition of further, more complex levels of syntactic analysis. The first part of the paper illustrates in detail the adopted annotation scheme, by relating it to more established linguistic notions and some specific issues of Italian syntactic analysis. The second part of the paper focuses on a detailed evaluation of relevant issues such as: reliability of text chunking with finite state technology, usability of a chunked text as a source for automatic acquisition of lexical information, and amenability of the chunking scheme to further, more complex levels of syntactic annotation.