UNICA IRIS Institutional Research Information System

Word-level and higher level annotation of the Sardinian Medieval Corpus

Nicoletta Puddu; Achim Stein

2018-01-01

Abstract

This paper introduces the Sardinian Medieval Corpus (SMC), the first linguistically annotated digital resource for Medieval Sardinian. The first part presents its textual and linguistic characteristics and discusses the problems they pose for both manual and automatic annotation. The second part describes the development of the first computational tools for the analysis of Medieval Sardinian, at the word level (lemmatization and part-of-speech tagging) and at the syntactic level (dependency parsing). It shows how manual and automatic approaches can be combined to build an annotated database efficiently, even for medieval texts.

Year: 2018
ISBN: 9783901716430
Keywords: Historical corpora; Sardinian; Digital humanities
Type: 4.1 Contribution in conference proceedings

Files in this record:

File	Access	Type	Size	Format
Puddu_Stein_CRH2.pdf	Archive managers only	Publisher's version (VoR)	222.17 kB	Adobe PDF

The metadata in IRIS UNICA are released under the Creative Commons CC0 1.0 Universal license, while the publication files are protected by copyright unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/234671
